Re: Bug: inconsistent escaping of coderef regexp

2021-04-07 Thread Tom Gillespie
Hi Nicolas,
I've included the simplest patch I could come up with for the
divergence in behavior between org-babel-tangle-single-block and
org-link-search. I think there are two new threads that I need to
create. One is related to how to make it possible to specify what
should be removed along with the coderef (i.e. coderef prefix), the
other is the addition of header arguments that provide the same
functionality as switches. Best,
Tom

> This is already conflating the two. I'd like to solve the issue at hand
> without having header args interfere at all.
>
> This can happen later, after a discussion on the ML.

Ok. I've included the simplest version of the fix, which is to use
org-src-coderef-regexp in org-babel-tangle-single-block.

> Would you mind answering my questions first? I still don't follow you
> about the coderef prefix/regexp.

https://code.orgmode.org/bzg/org-mode/src/2d78ea57cfad1ddc3e993c949daf117b76315170/lisp/org-src.el#L882

That line defines a hardcoded regular expression for matching
coderefs. The coderef prefix is the first =[ \t]*= and the coderef
regexp is equivalent to the fully formatted version of that format
string. Neither of those can currently be specified by the user. The
user should not be able to specify the coderef regexp due to the fact
that it is too easy to specify a regexp that will not work correctly
and because the format string is needed to make org-link-search work
for named coderefs (otherwise you wind up trying to replace .+ in the
coderef regexp which is a nightmare). The coderef prefix is something
that should probably be configurable by the user so that empty
comments are not left in the file. I also looked into detecting the
comment character for the language in question, but that is
significantly more difficult even using (with-temp-buffer (funcall
lang-mode) comment-start) because not all languages have sane comment
start values and comment-start is not complete, so we would need a way
to manually specify what to exclude anyway.
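
To make the prefix/regexp distinction concrete, here is a minimal
sketch of how a coderef format string can be turned into a regexp and
then used to strip coderefs. It is modeled on the hardcoded expression
linked above, but my/coderef-regexp and my/strip-coderefs are
hypothetical names used only for illustration, and the sketch
deliberately does not depend on Org being loaded.

```elisp
;; Sketch only: hypothetical stand-ins, not Org's own functions.
(defun my/coderef-regexp (fmt &optional label)
  "Return a regexp matching a coderef written with format string FMT.
When LABEL is non-nil match only that label, otherwise match any label."
  (format "\\([ \t]*\\(%s\\)[ \t]*\\)$"
          (replace-regexp-in-string
           "%s"
           (if label
               (regexp-quote label)
             "\\([-a-zA-Z0-9_][-a-zA-Z0-9_ ]*\\)")
           ;; Quote FMT first so its own characters are matched literally.
           (regexp-quote fmt)
           nil t)))

(defun my/strip-coderefs (text fmt)
  "Return TEXT with every coderef written with FMT removed."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (let ((re (my/coderef-regexp fmt)))
      (while (re-search-forward re nil t)
        (replace-match "")))
    (buffer-string)))
```

With the format "(ref:%s)", the line "(+ 1 2) ; (ref:sum)" comes back
as "(+ 1 2) ;". The whitespace prefix is removed but the comment
character is not, which is the empty-comment problem described above.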
From c30913da6b1c8d6be3670a59ae867df019505af3 Mon Sep 17 00:00:00 2001
From: Tom Gillespie 
Date: Wed, 7 Apr 2021 12:29:01 -0700
Subject: [PATCH] lisp/ob-tangle.el: Fix coderef removal during tangling

* lisp/ob-tangle.el (org-babel-tangle-single-block): Regularize
behavior when removing coderefs during tangling. This fixes an issue
where trailing whitespace would be retained when coderefs were removed
for tangling.
---
 lisp/ob-tangle.el | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/lisp/ob-tangle.el b/lisp/ob-tangle.el
index aa0373ab8..4c0c3132d 100644
--- a/lisp/ob-tangle.el
+++ b/lisp/ob-tangle.el
@@ -414,9 +414,8 @@ non-nil, return the full association list to be used by
 	 (src-lang (nth 0 info))
 	 (params (nth 2 info))
 	 (extra (nth 3 info))
-	 (cref-fmt (or (and (string-match "-l \"\\(.+\\)\"" extra)
-			(match-string 1 extra))
-		   org-coderef-label-format))
+ (coderef (nth 6 info))
+	 (cref-regexp (org-src-coderef-regexp coderef))
 	 (link (let ((l (org-no-properties (org-store-link nil
  (and (string-match org-link-bracket-re l)
   (match-string 1 l
@@ -445,8 +444,7 @@ non-nil, return the full association list to be used by
 	(funcall assignments-cmd params))
 	  (when (string-match "-r" extra)
 		(goto-char (point-min))
-		(while (re-search-forward
-			(replace-regexp-in-string "%s" ".+" cref-fmt) nil t)
+		(while (re-search-forward cref-regexp nil t)
 		  (replace-match "")))
 	  (run-hooks 'org-babel-tangle-body-hook)
 	  (buffer-string
-- 
2.26.3



Re: [Patch] to correctly sort the items with emphasis marks in a list

2021-04-19 Thread Tom Gillespie
Hi Greg,
seq cannot be used because it is not available in older versions
of emacs that org still supports. When support for those older
versions is dropped then seq could be used. Best,
Tom



Re: Concerns about community contributor support

2021-04-20 Thread Tom Gillespie
Hi Tim, David, and Gustav,
I am fairly certain that with only a few exceptions it is possible
to specify a context free grammar for org syntax, followed by a second
pass that deals specifically with markup and a few other forms,
notably the reassembly of things like plain lists. This is possible
because most org constructs are line oriented.

Just a note that the linked parser.rkt [0] is indeed a BNF describing org
syntax in the same style as a bison/yacc grammar. One of the reasons
why I set out to work on this was precisely so that there could be a
reference that could be consulted by the community when questions
about extended org come up.

There are proposals for new syntax that appear on this list with
terrifying frequency, and they are routinely shot down or simply
ignored for good reason, however it is hard to communicate that to
enthusiastic contributors who have an immediate use case that they
want to solve and share and are unlikely to be aware of side effects.
Having a grammar where such issues can be tested empirically should
provide a significant safeguard while also making it easier for
contributors to play with the grammar and see the issues.

In all my work on the grammar I have found maybe 2 or 3 places where
the grammar could be "extended" but it isn't so much extended as it is
regularized, where some parts of org already parse a more complex
grammar while other very similar parts choose not to. Overall the cost
of not parsing certain forms in certain situations adds complexity
rather than reducing it.

The situation for contribution is also further complicated by the fact
that the elisp implementation of org mode is internally inconsistent
in its behavior with regard to the syntax, so great care has to be
taken if someone tries to make an argument based on the behavior of
one component.

All this to say that the need for a conservative approach to changes
and extensions combined with the internally inconsistent behavior of
different parts of the elisp implementation means that the
introduction of new features is extremely difficult because it is hard
to predict the consequences on other parts of org.

Overcoming this is why I started working on the grammar, because
in the absence of a formal spec for what org should do, it is very hard
to make changes to what it is currently doing without having nasty
side effects.

Best!
Tom

0. https://github.com/tgbugs/laundry/blob/next/laundry/parser.rkt note
the upcoming path change (which I will note in the original thread when
it happens).

PS I'm planning to reply to the main thread as well. My short take is
that finding a dedicated and responsive maintainer that can take over from
Bastien is a high priority. The only other thing that might help is to
have some way to track outstanding and closed patches, issues, etc.
that is more accessible than trolling through years worth of posts on
this mailing list, but that is a can of worms that has already been
shot down multiple times.



Re: [PATCH] ob-tangle.el: Speed up tangling

2021-04-20 Thread Tom Gillespie
Hi Sébastien,
The temp -> rename approach is good, but you should probably use
make-temp-file to create the file to reduce the risk of
collisions/race conditions. For example as (make-temp-file (concat
file-name ".tangling")).
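
A minimal sketch of what I mean, assuming the caller passes the final
file name and the full contents as a string. The function name
my/write-file-via-temp is hypothetical and this is not the code in the
patch.

```elisp
;; Sketch only: a hypothetical helper, not part of ob-tangle.el.
(defun my/write-file-via-temp (file-name contents)
  "Write CONTENTS to FILE-NAME by writing a temp file and renaming it."
  (let* ((target (expand-file-name file-name))
         ;; An absolute prefix makes `make-temp-file' create the file next
         ;; to TARGET, so the rename stays on one file system.
         (tmp (make-temp-file (concat target ".tangling"))))
    (condition-case err
        (progn
          (write-region contents nil tmp nil 'silent)
          ;; Third argument t: replace TARGET if it already exists.
          (rename-file tmp target t))
      (error
       ;; Leave any existing TARGET untouched and clean up the temp file.
       (when (file-exists-p tmp) (delete-file tmp))
       (signal (car err) (cdr err))))
    target))
```

Note that the temp file is normally created with restrictive
permissions, so the tangle mode would still have to be set explicitly
on the result.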

I think that the location of condition-case is ok, but I wonder what
would happen if something were to fail before entering that? I think
that only a subset of the files would be tangled, but they would all
have their correct modes, so I think that that is ok.

I also think that the message to the user should probably not be
changed right now. While it might be useful for debugging, if someone
is tangling to a large number of files then the filenames/paths are
going to flood messages, so I would leave it out of this patch, and
possibly submit it as another patch for a separate discussion.

Best!
Tom



Re: [PATCH] ob-tangle.el: Speed up tangling

2021-04-18 Thread Tom Gillespie
Hi Sébastien,
   Some comments while looking over this (will report back when I have
tested it out as well). This is a section of the ob export
functionality that I have been looking for on and off for quite a
while because it is responsible for some bad and insecure behavior. I
think that some of your changes may have fixed/improved this as a side
effect. I don't know whether it is worth doing anything about the
issues in this patch, but since we are here, I think they are worth
mentioning. All of the issues that I'm aware of are related to what
happens if tangling fails part way through the process. First, your
patch already fixes a major issue which is that the modes of all files
would not be set if any one of them failed to tangle. Next, during the
process the existing file is deleted prior to tangling, which means
that it cannot be restored if tangling fails, it would be better if
the old file was moved to a temporary location and then deleted on
success or replaced on failure. This likely requires wrapping the bits
that can fail in unwind-protect and restoring on failure or fully
deleting at the end of success. The next issue is that setting the
tangle mode should happen before the file is written, an empty file
should be created, the mode should then be set, the contents of the
file should be written only after the mode has been set. This involves
a bit of reordering of operations in lines 124-126 of your patch. This
ordering of operations prevents security issues related to race
conditions and potential errors being evoked during write-region
(though again, your changes already make the tangling code much more
secure by setting the modes on each file immediately after writing
instead of how it works currently where if any other block encounters
an error then no modes were set). Best!
Tom
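
The ordering described above (create empty, set mode, then write) can
be sketched as follows. my/write-with-mode is a hypothetical helper for
illustration only, and it ignores the backup/restore step discussed
earlier in the message.

```elisp
;; Sketch only: a hypothetical helper, not the code from the patch.
(defun my/write-with-mode (file-name contents mode)
  "Create FILE-NAME empty, set MODE on it, then write CONTENTS."
  ;; 1. Create (or truncate to) an empty file.
  (write-region "" nil file-name nil 'silent)
  ;; 2. Set the permissions while the file is still empty.
  (set-file-modes file-name mode)
  ;; 3. Only now write the real contents.
  (write-region contents nil file-name nil 'silent))
```

There is still a short window between steps 1 and 2, but during that
window the file is empty, so nothing sensitive is exposed.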

On Sun, Apr 18, 2021 at 12:23 AM Sébastien Miquel
 wrote:
>
> Hi,
>
> The attached patch modifies the ~org-babel-tangle~ function to avoid a
> quadratic behavior in the number of blocks tangled to a single file.
>
> Tangling an org buffer with 200 blocks to 5 different files yields a
> 25 % speedup.
>
>
> * lisp/ob-tangle.el (org-babel-tangle-collect-blocks): Group
> collected blocks by tangled file name.
> (org-babel-tangle): Avoid quadratic behavior in number of blocks.
>
> --
> Sébastien Miquel



Re: Properties on buffer level

2021-02-12 Thread Tom Gillespie
You should be able to run C-c C-c on #+property: directives before the
first headline and they will be updated without reloading the buffer.
Best,
Tom



Bug: doc string for "org-end-of-meta-data"

2021-09-15 Thread Tom Davey
Hello everybody,

I believe the last paragraph of the doc string for the function
"org-end-of-meta-data" contains an error. That one-sentence paragraph
currently reads:

When FULL is non-nil but not t, skip planning information, 
clocking lines and only non-regular drawers, i.e. properties 
and logbook drawers.

I believe that should be "regular drawers," not "non-regular drawers." IMO,
the last paragraph could be clearer were it rewritten as follows:

   When FULL is non-nil but not t, skip only planning information, 
   clocking lines and regular drawers, i.e. properties and logbook 
   drawers. If any non-regular drawers exist and do not follow the 
   two regular drawers, stop at the first non-regular drawer instead.

I believe that this expansion of the paragraph corrects the error and adds
coverage of a rare case.

Many thanks to all the developers of Org-mode.  

--
Tom Davey
t...@tomdavey.com
New York NY USA





Re: [org-cite] citations in property drawers?

2021-09-15 Thread Tom Gillespie
> That would be a terrible idea. Exporters are not required to handle all
> data contained in properties drawers, so this may introduce errors,
> e.g., when trying to number citations.

I agree completely. You can't export something that has no anchor in
text that would be rendered. Maybe I misunderstood the original
question, because there is no way that a citation or footnote could be
exported from there, so I think in your conception text that follows
the format of the citations or footnotes isn't actually a citation or
footnote unless it exports as such.

Best,
Tom



RE: Bug: doc string for "org-end-of-meta-data"

2021-09-15 Thread Tom Davey
Hi Marco, 

You make sense. What you propose to substitute is easier to understand and
concise: 

 When FULL is non-nil but not t, skip planning information, 
 properties, clocking lines and logbook drawers.

Thank you! 

--
Tom Davey
t...@tomdavey.com
New York NY USA

-----Original Message-----
From: Marco Wahl  
Sent: Wednesday, September 15, 2021 5:04 PM
To: Tom Davey 
Cc: 'emacs-org list' 
Subject: Re: Bug: doc string for "org-end-of-meta-data"

Hello Tom,

> I believe the last paragraph of the doc string for the function 
> "org-end-of-meta-data" contains an error. That one-sentence paragraph 
> currently reads:
>
> When FULL is non-nil but not t, skip planning information, 
> clocking lines and only non-regular drawers, i.e. properties 
> and logbook drawers.
>
> I believe that should be "regular drawers," not "non-regular drawers." 
> IMO, the last paragraph could be clearer were it rewritten as follows:
>
>When FULL is non-nil but not t, skip only planning information, 
>clocking lines and regular drawers, i.e. properties and logbook 
>drawers. If any non-regular drawers exist and do not follow the 
>two regular drawers, stop at the first non-regular drawer instead.
>
> I believe that this expansion of the paragraph corrects the error and 
> adds coverage of a rare case.

I think the use of the word "regular" is not a good idea in the
documentation of org-end-of-meta-data.  I could not find any occurrence of
the term "regular drawer" in the org-info manual.  There is a section where
the property drawer is called "special".

In conclusion I'd say that the logic of the recent documentation is okay
with "regular" meaning "non-special".

Finally I propose to remove completely the categorisation due to "regular"
from the documentation.  Which reads:

 When FULL is non-nil but not t, skip planning information, 
 properties, clocking lines and logbook drawers.

WDYT?




Re: [org-cite] citations in property drawers?

2021-09-16 Thread Tom Gillespie
> I understand the problem, but the solution should not be: "let's pretend
> export does not exist".

From my perspective any org object that is not in a section that
allows org objects could in principle be parsed as such, but it would
not be in the core of the grammar, and it also would have to parse to
something that did not trigger side effects related to export.

Allowing org objects to appear at arbitrary places in the grammar is
definitely not a good idea because in many senses they cannot actually
be those objects. Maybe the syntax could be the same, but they would
have to be "shadow objects" or something like that?

Best,
Tom



Re: [org-cite] citations in property drawers?

2021-09-14 Thread Tom Gillespie
Hi Bruce,
I could certainly imagine using it, and I don't see any issue with
doing it from the point of view of the grammar. Footnotes can appear
in a property drawer without issue, though obviously they don't
export. One question, though, since I may have missed this in the other
threads: is cite: allowed without the square brackets? Either way, org
element just parses the value to a string and it is up to any
consuming application to parse the node property further. Best!
Tom

On Thu, Sep 9, 2021 at 11:45 AM Bruce D'Arcus  wrote:
>
> Just bumping this.
>
> Another question about where to allow cite elements.
>
> On Fri, Aug 20, 2021 at 4:18 PM Bruce D'Arcus  wrote:
> >
> > So this is a tentative request/question; I'm not really sure the best
> > approach here.
> >
> > This is based on discussion with one of the org-roam-bibtex developers
> > about what the proper way to indicate an org-roam note is a
> > bibliographic note; e.g. a note about a bibliographic source.
> >
> > Traditionally in org-roam, that is in a property drawer; like:
> >
> > :ROAM_REFS: cite:wallace-wells2019
> >
> > That is using org-ref syntax there.
> >
> > So the obvious question is should one just put an org-cite citation
> > there to do the same thing?
> >
> > Right now, the answer is clearly no, since they aren't allowed in
> > property drawers.
> >
> > But perhaps they should be, just as any link can be?
> >
> > Except if they are, I recognize, they need to be treated as special
> > cases; e.g ignored for the purposes of export and such.
> >
> > WDYT?
> >
> > Bruce
>



Re: A requires/provides approach to linking source code blocks

2021-07-13 Thread Tom Gillespie
We have been receiving many new feature suggestions and requests
coming in for org babel. I think that Tim's suggestion is the right
one. Nearly all of these need to be implemented as an extension first
and tested independently. Further, even if this is done, it should be
clear that there is zero expectation that such extensions will be
incorporated.

Once I wrap up the formal grammar for org, one of the next things I
plan to work on is a clear specification for org babel. This is
critical because so many of the suggestions that come in deal with
individuals' specific problems and thus fail to account for how such
features interact with existing features and how the newly proposed
feature would block some other features in the future, confuse users,
etc. Such suggestions also often fail to account for increased
complexity, nor have they been exposed to a sufficient number of
examples to reveal fundamental ambiguities in how they could be
interpreted. The issues with variable behavior between ob langs for
:pre :post :prologue :epilogue etc. are already enough to keep us busy
for quite some time.

With regard to this thread in particular, it is of some interest, but
there are fundamental issues, including the fact that certain
languages (e.g. racket) expect module code to exist somewhere on the
file system. There are ways around many of these issues, in fact there
are likely many ways around any individual issue, so org babel needs
to systematically consider the issues and provide a clear
specification, or at least a guide for how such cases should be
handled.

To give an example from one of my continual pain points: I start
writing python or racket in an org src block and then I want it to be
a library so that it can be reused by other code both inside and
outside the org file without having to resort to noweb.

What is the best way to handle this? I don't know. Right now I tangle
things and resort to awful hacks for the reuse-in-this-org-file case, but
I'm guessing there is a better generic solution which would allow _any_
org block to be exported as a library instead of nowebbed in.

Before jumping for any particular suggestion for how to handle this
we need to explore the diversity of cases that various ob langs
present, so that we can find a solution that will work for all of
them. After all, packaging code to a library for reuse is an
extremely common pattern that org babel should be able to
abstract, but it is a major undertaking, not just the addition of a
keyword here and there.

In short I suggest that we issue a general moratorium on new org babel
feature suggestions until we can stabilize what we already have and
provide a clear specification for correct behavior. Until we have that spec
we could encourage users to create extensions that implement those
features.

Best,
Tom


PS The other next thing that I am working on might be another way out
for this particular feature request. Namely, it is simplifying and
extending org keyword syntax so that new keywords (with options) and
associated keywords can be specified using keyword syntax within a
single org file. This would make it possible to get useful high level
keyword behavior in a single file without burdening the core
implementation with more special cases for associated keywords, and it
would allow users to write small elisp functions that could do some of
what is suggested here, all without need to add anything to the core
org implementation.



Re: [PATCH] Rename headline to heading

2021-08-08 Thread Tom Gillespie
Hi André,
Thanks for taking a first pass at this. I think that this patch is
difficult to review. Could you break it into two separate patches, one
for documentation (non-code, e.g. docstring and comment) changes and
one for code changes?  That way we could more easily see where we may
need to mitigate the kind of issues Maxim noticed. Best!
Tom



Re: bug: Error handling in source blocks.

2021-08-10 Thread Tom Gillespie
I will also chime in here to say that managing output streams and
errors for babel is a major new feature that I am interested in. The
issue, as Tim points out, is that there is a lot of complexity lurking
here due to the fact that certain languages have fundamentally
different capabilities and ways of handling or not handling errors,
and of running code (on arbitrary hosts) in the first place.

What works for one will almost certainly not work for another. Take
for example ob-lisp where there is already built in error handling in
emacs itself. Compare that with python where someone would likely need
to implement a special PYTHONBREAKPOINT entrypoint or something like
that, if it were possible at all.

I have had a draft of a document on what I called "babel
regularization" for well over a year now, but it is not in a state
that would be productive to share due to the sheer number of ob-langs
and systems affected and the need to be able to clearly catalog and
articulate the diversity of existing behaviors.

If you dig through old conversations on this list you will find a
discussion of whether ob-shell :results should default to value or
output; we were barely able to agree on which principles should be
followed to make the decision. In that case we
were lucky that there was already a way for users to set their desired
behavior in their init file or in a setup file or in the file itself.
How to handle errors will be much more complex, in part because it
will touch on what ob-lang implementations are able to overwrite
and/or must provide in order to actually function. At the moment there
are practically no constraints.

Lots of work to do here, so grateful for a report on the variability
in the behavior of the existing system.

Best!
Tom



Re: [Concept talk] Org-connector

2021-08-10 Thread Tom Gillespie
Hi Sébastien,
I think you are probably looking for org-sync which implements
exactly this functionality. You would need to write a new backend for
your particular ticketing system, but github, bit bucket, and redmine
backends already exist and can serve as an example. Best,
Tom

https://orgmode.org/worg/org-contrib/gsoc2012/student-projects/org-sync/tutorial/



Re: Expanding how the new cite syntax is used to include cross-references - thoughts?

2021-08-10 Thread Tom Gillespie
In general I like John's suggestion. Org link syntax can be made to do
nearly anything because it is possible to bind link actions to
arbitrary elisp functions (I have used them to create buttons that run
source blocks for some of my non-technical colleagues). The grouping
of cross references under org-cite seems reasonable to me, and I would
love it if they could handle arbitrary references, e.g. to hypothesis
web annotation links or org-capture links.
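
As an illustration of binding a link action to an elisp function, here
is a minimal sketch of the "button that runs a source block" idea. The
link type "run-block" and the function my/org-run-block-follow are
hypothetical names, and the sketch assumes an Org recent enough to
provide the ol library (9.3 or later).

```elisp
;; Sketch only: hypothetical link type and follow function.
(require 'ol)        ; Org's link library
(require 'ob-core)   ; Babel functions used below

(defun my/org-run-block-follow (name &optional _arg)
  "Execute the source block called NAME in the current Org buffer."
  (let ((pos (org-babel-find-named-block name)))
    (unless pos
      (user-error "No source block named %s" name))
    (save-excursion
      (goto-char pos)
      (org-babel-execute-src-block))))

;; Following a [[run-block:NAME]] link now calls the function above.
(org-link-set-parameters "run-block" :follow #'my/org-run-block-follow)
```

A link such as [[run-block:build][Run the build]] would then execute
the block named "build" when followed.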

Actually, having written this now, I think that both solutions have
their own use cases. Org cite is clearly about providing evidence for,
or a scholarly reference for something, and critically it can embed
some metadata about that reference in the document as a citation or
perhaps as an excerpt (an extension of what org-ref does now when the
cursor is over a reference?). Regular links do not provide any way to
embed metadata within the document, they are purely pointers.

That being said, it seems that there are a number of use cases where
org-ref links are simply internal document links that can point to an
element with a specific #+name: and no embedded information about the
target is needed. However, I think it would be a mistake to use up
equation/eq and table/tbl or figure/fig prefixes for references that
are internal to org, because it implicitly limits/collides with the
#+link: keyword.

Best,
Tom



[PATCH] lisp/ox-html.el: Restore org-svg class

2021-07-30 Thread Tom Gillespie
Hi,
   This patch restores the addition of class="org-svg" to svg images
during html export. Best!
Tom
From 4363eec0913ccd0d05ecf3d6346208c62d3597f8 Mon Sep 17 00:00:00 2001
From: Tom Gillespie 
Date: Fri, 30 Jul 2021 20:53:07 -0700
Subject: [PATCH] lisp/ox-html.el: Restore org-svg class.

* lisp/ox-html.el (org-html--format-image): Restore org-svg class.
d96e8975791bd3b1a5f8fdb75609d73f134dc831 removed the org-svg class
which is necessary even when using  tags otherwise svg images
will render at absurdly large sizes.
---
 lisp/ox-html.el | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/lisp/ox-html.el b/lisp/ox-html.el
index bd6771a76..f25a9731e 100644
--- a/lisp/ox-html.el
+++ b/lisp/ox-html.el
@@ -1707,7 +1707,9 @@ a communication channel."
 (org-html-encode-plain-text
  (org-find-text-property-in-string 'org-latex-src source))
   (file-name-nondirectory source)))
- attributes))
+ (if (string= "svg" (file-name-extension source))
+ (org-combine-plists '(:class "org-svg") attributes '(:fallback nil))
+   attributes)))
info))
 
 (defun org-html--textarea-block (element)
-- 
2.31.1



Re: Help requested: Support for basic Org mode support in tools outside of Emacs

2021-08-03 Thread Tom Gillespie
Hi Karl,
   Great initiative. For many of the things in the table you will
probably want to link to the underlying library. For example for github
and gitlab there is https://github.com/wallyqs/org-ruby (which I have
been trying to find time to submit fixes to). I've linked a couple
relevant threads and repos. Best!
Tom

python https://github.com/novoid/Memacs
python https://github.com/karlicoss/orgparse
python https://github.com/bjonnh/PyOrgMode
racket https://github.com/tgbugs/laundry/tree/next
racket https://github.com/jeapostrophe/org-mode
racket https://github.com/antoineB/org-mode
See https://github.com/tgbugs/laundry/blob/next/laundry/cursed.org for
an org file that github fails to render
clojure https://github.com/200ok-ch/org-parser/blob/master/resources/org.ebnf

https://orgmode.org/list/ca+g3_pobab1qx1zv8q9sjfh4khuhvmanxp3xo7__6eosdxk...@mail.gmail.com/
https://orgmode.org/list/ca+g3_pnj6pekqv+twfkwbd778xhw9wsfx+kjjhjsoreplhu...@mail.gmail.com/

On Tue, Aug 3, 2021 at 11:46 AM Greg Minshall  wrote:
>
> Karl,
>
> orgtbl-query is a script for querying tables in .org files.  it doesn't
> do any special text formatting.
>
> https://gitlab.com/minshall/orqtbl-query
>
> cheers, Greg
>



Re: [PATCH] lisp/ox-html.el: Restore org-svg class

2021-09-21 Thread Tom Gillespie
Bumping this patch for 9.5.

On Fri, Jul 30, 2021 at 8:59 PM Tom Gillespie  wrote:
>
> Hi,
>This patch restores the addition of class="org-svg" to svg images
> during html export. Best!
> Tom



Re: Org lint and named source blocks

2021-09-21 Thread Tom Gillespie
> Should we allow syntax like #+KEYWORD:value to be correct or do we
> require a whitespace/space after colon all the time?

The spec as written is ambiguous/silent on this issue. In my work on
laundry tokenizer and grammar I have found keyword syntax to be a
thorny issue, and I strongly suggest that for the time being we either
make no ruling on this or we state that the colon that ends the
keyword should be followed by a space as a precautionary measure.
The safe thing to do is to always require whitespace after the colon
because it guarantees correct interpretation.

Requiring whitespace after the colon simplifies the grammar, however
it means that you can't compact keyword lines, and it induces an
annoying failure mode where missing spaces are no longer keywords.
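
To show that failure mode, here is a minimal sketch of the
conservative rule only (whitespace required after the colon). The names
my/strict-keyword-re and my/parse-strict-keyword are hypothetical, and
the regexp is an assumption for illustration, not the one used by
org-element or by laundry.

```elisp
;; Sketch only: a deliberately strict, hypothetical keyword matcher.
(defconst my/strict-keyword-re
  "\\`[ \t]*#\\+\\([^ \t\n:]+\\):\\(?:[ \t]+\\(.*\\)\\)?\\'"
  "Match a keyword line whose colon is followed by whitespace or nothing.")

(defun my/parse-strict-keyword (line)
  "Return (KEY . VALUE) if LINE is a keyword under the strict rule, else nil."
  (when (string-match my/strict-keyword-re line)
    (cons (match-string 1 line)
          (or (match-string 2 line) ""))))
```

Under this rule "#+title: Hello" parses to ("title" . "Hello"), while
"#+title:Hello" is not a keyword at all and returns nil.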

However, it is technically possible to make keywords work without the
whitespace, so long as there is at least one whitespace prior to the
next colon (but not contained in square brackets, e.g. #+key:lol[ a b
c ]:value is a well formed keyword under a slightly generalized
grammar). The problem is that we would like to make keyword syntax
fully closed, and I need a bit more time to get that worked out before
any definitive conclusions are drawn.

The complexity of the generalized keyword syntax can be seen here
https://github.com/tgbugs/laundry/blob/5a396bef98d9a3cd9ee929f21cd47612dd6cb1ac/laundry/lex-abbrev.rkt#L107-L249

Best,
Tom



Re: how to org-babel-detangle with nested noweb?

2021-10-18 Thread Tom Gillespie
Hi Edgar,
Detangling of nested noweb blocks tangled using
:comments noweb is broken at the moment. There are
some deep bugs that need to be worked out, and the last
time I looked at the code my conclusion was that it
would be better to do a complete rewrite starting from a new
specification of the behavior along with some gnarly test
cases to ensure that everything works as expected.
Best!
Tom



Re: Empty headline titles unsupported: Bug?

2021-09-26 Thread Tom Gillespie
Hi Bastien,
I am strongly in favor of this change. It simplifies the grammar
significantly, and from my work on the laundry lexer and parser, I'm
99% certain that the current behavior is a bug that is the result of
gobbling the space after the stars in the headline. The correct
implementation peeks 1 char ahead for the space, and then starts
parsing again starting with the space. This is because tags MUST be
preceded by a space, so if you incorrectly gobble the space after the
stars then that space cannot be used as the start for tags. Best,
Tom
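
A minimal sketch of the peek-then-reparse approach, assuming a heading
is stars, a space, an optional title, and optional tags. The function
my/parse-heading and its tag regexp are hypothetical and much simpler
than either org-element or the laundry grammar (no TODO keywords,
priorities, or comment markers).

```elisp
;; Sketch only: a hypothetical, simplified heading parser.
(require 'subr-x)  ; `string-trim' on Emacs older than 28

(defun my/parse-heading (line)
  "Parse LINE as a heading; return (LEVEL TITLE TAGS) or nil."
  (when (and (string-match "\\`\\*+" line)
             ;; Peek at the next character without consuming it.
             (< (match-end 0) (length line))
             (eq (aref line (match-end 0)) ?\s))
    (let* ((level (match-end 0))
           ;; REST still starts with the space that followed the stars,
           ;; so that space can serve as the space required before tags.
           (rest (substring line level))
           (tags-start (string-match
                        "[ \t]+\\(:[[:alnum:]_@#%:]+:\\)[ \t]*\\'" rest))
           (tags (and tags-start
                      (split-string (match-string 1 rest) ":" t)))
           (title (string-trim (if tags-start
                                   (substring rest 0 tags-start)
                                 rest))))
      (list level title tags))))
```

With this, "* :tag:" parses as a level 1 heading with an empty title
and the tag "tag". If the space were consumed along with the stars, the
tag pattern would have no preceding space to match and ":tag:" would
end up as the title.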



Re: [PATCH] Accept more :tangle-mode specification forms

2021-09-30 Thread Tom Gillespie
I strongly oppose this patch. It adds far too much complexity to the
org grammar. Representation of numbers is an extremely nasty part of
nearly every language, and I suggest that org steer well clear of
trying to formalize this. With an eye to future portability I suggest
that no special cases be given to something as important for security
as tangle mode without very careful consideration. Emacs lisp closures
have clear semantics in Org and the number syntax is clear. If users
are concerned about the verbosity of (identity #o0600) they could go
with the shorter (or #o0600). Best,
Tom
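
For anyone unsure why the two forms are interchangeable, a minimal
sketch: #o is ordinary Emacs Lisp read syntax for an octal integer, so
both forms evaluate to the same number. The variable names are
hypothetical and exist only for this illustration.

```elisp
;; Sketch only: both forms below evaluate to the integer 384.
(defconst my/tangle-mode-long  (identity #o0600))
(defconst my/tangle-mode-short (or #o0600))

;; Printing the value back in octal shows the familiar permission bits.
(format "%o" my/tangle-mode-short)   ; "600"
```

In a source block header this would be written as
:tangle-mode (or #o0600).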



Re: [PATCH] Don't fill displayed equations

2021-10-02 Thread Tom Gillespie
> do not see a reason for idiosyncrasy that markup intended to add LaTeX
> snippet that looks like exactly as LaTeX commands for this purpose and
> even actually preserved during export to LaTeX should have different
> semantics for Org parser.

The answer is that \[ \] can only occur inside paragraphs. The issues
here are exactly the same as the issues for inline footnotes. Org gives
us a bit more power, but not the full power because Org is Org, not
Latex. Making \[ \] available outside of a paragraph would be a massive
breaking change.

In Timothy's original example he is narrowly skirting the syntax to
allow that all to remain a single paragraph, but stick in a newline
anywhere and boom, no more paragraph, no more equation.

I guess one thing I'm missing/not understanding is when/why people
want to use \[ \] instead of a full #+begin_export latex block?

Best,
Tom



Re: Comments break up a paragraph when writing one-setence-per-line

2021-10-02 Thread Tom Gillespie
A general comment (heh) here. This is not a bug and not easily fixed.
Line comments are their own top level element distinct from
paragraphs. If you need something that fits in a paragraph you can use
@@comment:@@ at the start of a line.

I agree that it is annoying, but Org line comment syntax also only
works if it starts the line, so the behavior diverges from traditional
code comments. It may make sense to update the docs to call them "line
comments" instead of just comments.

One area where we could almost certainly do better is in how line
comments break up the flow of text. I'm not sure there will ultimately
be much we can do about it, but it is worth investigating.

Best,
Tom



Re: [PATCH] Don't fill displayed equations

2021-10-02 Thread Tom Gillespie
Hi Timothy,

> │ \[
> │   not part of a paragraph
> │ \]

My point is that that parses first as a paragraph (check org-element-at-point).
\[ and \] would be meaningless if it did not first parse as a paragraph.

> I also don’t see how footnotes are analogous, as footnotes are placed in the
> middle of a line of text.

Inline footnotes [fn::
can span
multiple lines] but can't contain empty lines because the empty line ends the
paragraph that they are contained in.

> org-latex-preview :)

But surely #+begin_export latex works with org-latex-preview? If not then
that would be a feature request to org-latex-preview yes?

Best!
Tom



Re: [PATCH] Don't fill displayed equations

2021-10-03 Thread Tom Gillespie
Some thoughts.

> Maybe you are right and Tom was actually assuming \begin{equation*}, not
> #+begin_export latex.

Correct. My bad on that one.

> Just as Timothy, I believe that \begin{equation*} is unnecessary verbose
> when \[ works *mostly* in a similar way.

\begin{equation*} is absolutely required if you want to be able to include
newlines because \[ and \begin are not similar at all as far as parsing
is concerned.

From the spec: https://orgmode.org/worg/dev/org-syntax.html#LaTeX_Environments
> CONTENTS can contain anything but the “\end{NAME}” string.
The spec is not completely accurate since latex environments can't
contain a new heading, but the point is that latex environments are
elements, whereas \[ \] is an object.

> If I understand correctly, making \[ \] available outside paragraph
> would mean that it becomes a new element (currently \[\] is a
> latex-fragment object).

Correct. Promoting \[ to an element would mean every \ in an org file
becomes a stop word. Also, since full-fledged latex environments
already exist to serve this purpose I find it hard to justify, especially
given that Org tries to give clear indication of when a block structure
is starting and ending.

> Isn't the whole point of the \[ ... \], \( ... \), $ ... $, $$ ... $$,
> and \begin{env} ... \end{env} and constructs in Org to be consistent
> with LaTeX?

For \begin and \end yes. For the others no. In general it would be to
make it possible to express things using latex-like syntax that would
otherwise require Org to come up with some new and different syntax.
These are values that may be translated to latex, but they exist inside
a larger syntax that is decidedly not latex, and thus they only have
meaningful translation to latex if they exist as well formed Org.

As a side note, the $ syntax is slated to be deprecated and removed.
https://orgmode.org/worg/dev/org-syntax.html#Entities_and_LaTeX_Fragments
> It would introduce incompatibilities with previous Org versions, but
> support for $...$ (and for symmetry, $$...$$) constructs ought to be removed.

> Indeed, it will be a breaking change.

I'm actually fairly certain that such a change should never be made
due to the recent changes in org link syntax. Specifically given how
\[ is used for escapes in links. https://orgmode.org/manual/Link-Format.html
This means that the only place you could reliably use \[ is at the start of a
new line preceded only by whitespace. However, if this were to happen then
pretty much every org document that uses \[ \] is at risk for being broken
because something that was once a single paragraph will now be multiple
paragraphs.

If you need multiline use \begin \end, that is what they are there for, and they
fit better with org's general extensible approach to blocks. I would dearly love
to be able to have a single shorthand for src blocks that worked inline and
standalone, but the complexity that it would induce is just not worth it. Same
thing for \[ \]. It seems simple until you get down to accounting for all the
edge cases that it would induce in the grammar.

Consider the case where you have something like

\[ something something

more content
more content [[www.example.com/\]oops][evil link]] \]

I've seen enough cases that are similar to this in the existing implementation
that have inconsistent behavior that I can safely say that this one would too.
Not to mention that I can think of at least 3 different cases that will all have
slightly different behavior that is inexplicable to users at best and
infuriating at worst.

\[ a

b \]

\[
a
b
\]

a \[ b

c \] d

etc. There are plenty more variants that would all be subtly different depending
on the exact way such a thing were implemented.

In short. Just not worth it.



Re: [PATCH] Don't fill displayed equations

2021-10-04 Thread Tom Gillespie
> Does anybody have any other thoughts?

From time to time I encounter random patterns that I don't want to be
reformatted during a fill operation. Maybe a custom variable like
org-fill-paragraph-skip-regexp or similar that could be set by the user?
For Timothy's use case he would set it to the regexp provided in the
original patch? Not sure how much of the implementation in the patch
is dependent on that particular regexp, but a general solution that
could even be set per org file might be a very useful new feature.
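
A rough Python sketch of the idea, not of the patch itself. The
skip_regexp parameter stands in for the hypothetical
org-fill-paragraph-skip-regexp, and the sketch skips whole paragraphs
that match, which may well be coarser than what the original patch
does.

```python
import re
import textwrap


def fill_paragraphs(text, skip_regexp=None, width=40):
    """Refill each blank-line separated paragraph unless it matches skip_regexp."""
    skip = re.compile(skip_regexp) if skip_regexp else None
    filled = []
    for paragraph in text.split("\n\n"):
        if skip and skip.search(paragraph):
            filled.append(paragraph)  # leave the user's layout alone
        else:
            filled.append(textwrap.fill(paragraph, width=width))
    return "\n\n".join(filled)


sample = "aaa bbb\nccc\n\n\\[\n  x = 1\n\\]"
print(fill_paragraphs(sample, skip_regexp=r"\\\["))
```

With the skip regexp set, the prose paragraph is refilled while the
displayed equation keeps its line breaks.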

Best!
Tom



Re: Org lint and named source blocks

2021-10-04 Thread Tom Gillespie
> By the way, wouldn't it be better to use tree-sitter rather than
> something else for the format grammar?

Not really since we are going to need more than one implementation
using a parser generator to avoid baking implementation specific
details into the spec by accident. This is true for more than just
the grammar as well. The complexity of tokenization, parsing,
expanding, etc, for Org means that we are going to need multiple
implementations to nail the behavior for any formal spec.

That said, we definitely want a TS implementation at some point.
See https://github.com/tgbugs/laundry/issues/1 for a recent
discussion about ways forward.

The implementation I'm working on should translate to TS without
too much work since both brag and tree sitter describe LR variants.
There may be some subtle differences, but nothing fundamental.

The issue for me is that I don't have the bandwidth to get started
with a full tree sitter implementation, especially because it is going
to need a custom scanner, and because you're effectively on your
own when it comes to reconstructing the output of the AST into the
actual internal representation of an Org file. I also have no idea how
to deal with nested parsers in tree sitter. I have some ideas about
how it might be done, but nothing concrete (see the linked issue
for more on that).

Best,
Tom



Re: [PATCH] Accept more :tangle-mode specification forms

2021-10-01 Thread Tom Gillespie
> I'd like to understand these objections better. Aren't you overstating
> what is at issue?

Yes, after hitting send I realized I overstated my position a bit.
In the meantime the comments in this thread are encouraging,
however I have finally figured out what I was really trying to say.

tl;dr file permission modes are not universal and should thus not
be part of the Org implementation. Org itself knows nothing about
files or permissions; that is the job of the system Org is running in/on.
Therefore, so long as we make it abundantly clear that the
value for :tangle-mode is not expected to be portable and that
it is always up to the user to ensure correct behavior, then we
are ok. I'm not happy about this conclusion from a security
perspective, but it isn't really worse than the situation we have
right now.


As many have pointed out, the grammar itself will not be affected.
However, other parts of the spec will. In general my objective is to
try to reduce the number of special cases that an org implementation
has to know about and delegate them to something else.

However in this case it is a bit tricky because of the security implications
and due to the fact that octal modes for file permissions are NOT universal
and should not be expected to be universal!

I actually think that my gut reaction was correct, but was expressed
in the wrong way.

Unix file modes are not universal and should thus not be encoded as
part of a portable document format. This means that it is up to the
user to know what representation is suitable.

Right now that representation is delegated to Emacs, because Emacs
handles file permissions for Org, and Emacs' language for modes is
octal.

There are some octal modes that do not translate on Windows, and cannot
be correctly set. There will (hopefully) be some happy day in the future
where there is an operating system that will run Org babel where octal
file modes do not exist at all!

Therefore I suggest that we do not enshrine a particularly obscure way
of expressing file modes into Org itself. Right now Org is confined to
Emacs' representations, which in a sense protects Org from becoming
too ossified by bad designs of the past --- Emacs can keep all that
for us!

If we want a more user friendly syntax for this I would suggest that we do
something like what has been done for Org babel :results, i.e. like
:tangle-mode read write execute, unfortunately that does not compose
well at all with user, group, and other and becomes exceedingly verbose.


Final conclusion, after all that rambling, is that I'd actually be ok with
any of the solutions proposed, so long as it is clear that :tangle-mode
will always be implementation dependent, and may or may not be
meaningful depending on which operating system you are using.
Unfortunate for security, but I don't see any way around that. The
best we could do for security would be for implementations to
test the file modes after tangling to ensure that they match,
which is more important I think.
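
As a sketch of what such a post-tangle check could look like, here is
a Python version (not elisp, and not anything Org does today).
tangle_with_mode is a hypothetical helper: it writes a file, requests
a mode, then reads the mode back and compares.

```python
import os
import stat
import tempfile


def tangle_with_mode(path, contents, mode):
    """Write contents, request mode, and report whether the request stuck."""
    with open(path, "w") as handle:
        handle.write(contents)
    os.chmod(path, mode)
    # Read the permission bits back instead of trusting the request.
    actual = stat.S_IMODE(os.stat(path).st_mode)
    return actual == mode, actual


with tempfile.TemporaryDirectory() as directory:
    target = os.path.join(directory, "secrets.sh")
    matched, actual = tangle_with_mode(target, "echo hi\n", 0o600)
    print(matched, oct(actual))
```

On a system where the requested bits cannot be represented, the first
value would come back false and an implementation could warn or
abort.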

That said, reducing the number of forms as Eric suggests would
be a happy medium.

Best!
Tom



Re: Org lint and named source blocks

2021-10-04 Thread Tom Gillespie
Thanks for the pointer! The actual point of contact seems to be
https://github.com/milisims/tree-sitter-org. Good to find another
group that is working on this. Best,
Tom



Re: On zero width spaces and Org syntax

2021-12-03 Thread Tom Gillespie
one, and probably
would already have been done, and I suspect people might use it.

There are very few syntax changes that reduce the complexity for
Org (though there are some). The rest have major costs, both in
implementation time, and in disruption of workflows, and hunting
down of edge cases, and total complexity.

The burden of proof for syntax changes lies squarely with the
individual(s) suggesting the change to show that it can be
done without disrupting the existing implementation and without
inducing complexity and changing the interpretation of existing
documents. I say this as someone who has at least one major
syntax change suggestion in the pipeline.

Requesting a syntax change is among the most deeply
invasive and complex things that can be done. I know that
syntax is also the most obvious to users, it is their interface
to the format after all! However, each individual shares that
interface with thousands of other people. The maintainers have
to speak for those thousands who never read, much less respond
on this mailing list, and that almost always means that the
response will be one that is decidedly conservative.

I don't mean to be dismissive of the suggestion, but a lot of
time is spent on this list walking back ideas that have not
had sufficient time put into understanding what the
unintended consequences would be, so I wouldn't say
that it is irresponsible, I would say instead that it lacks
sufficient rigor and depth to be seriously considered. If you
can add those to this proposal (e.g. in the form of a patch)
then I suspect it would get a much warmer reception.

Best,
Tom



Re: Some commentary on the Org Syntax document

2021-12-03 Thread Tom Gillespie
Hi Timothy,
   Replies in line. Some things might seem a bit out of order
because I responded from bottom to top. Best,
Tom

> from heading to bed, so to quote Pascal "I have only made this letter
> longer because I have not had the time to make it shorter".

Likewise, and I've heard it as Mark Twain :D

> I think a big problem is the mix of implicit and explicit information.
> Some components are rigorously specified in terms of the characters they
> may contain, elements and objects that are recognised inside them, and
> even the order in which different parts of the pattern are parsed.

I agree completely.

> As mentioned originally, the current Dynamic Blocks description doesn't
> even mention the CONTENTS part of the pattern, and relies on the reader
> inferring that it operates similarly to the CONTENTS part of Drawers.

Indeed this should be fixed.

> Forcing the reader to start making inferences like this is a treacherous
> path, and I think I can blame for some of the other issues I've
> experienced. Take for instance the "surely X can't contain a newline?"
> comments I've made. In the Node Properties and Entities descriptions you
> have statements along the lines of "X can contain any character [...]
> except a newline". In my mind this then sets up the reader to interpret
> a similar statement without the "except a newline" clause to mean that
> newlines are permitted.

I agree completely and had almost the exact same experience as you
when I was working on it. As I mention below, my responses were to
illustrate why the explicit information is missing, not to suggest that it
should be left out. We should definitely work to make everything more
explicit so that future readers don't have to go through the same issues
we have.

> I'm also thinking that the term "element" is overworked in the document.
> It's basically pulling triple duty: you have Elements, Greater
> Elements, and elements which are Elements and/or Greater Elements.

In extreme agreement.

> 3. Section

Technically this isn't part of the syntax, rather it is part of
elisp Org mode's internal representation. I'm not sure I would
even mention sections at all, because they have to do with
the interpretation of the syntax. In a section on the internal
representation for Org sections definitely belong, but they
are incidental. That said, I suspect we will find that they are
useful for talking about the behavior of the file under transformation,
e.g. "headings are not reordered when pressing M-up or M-down,
sections are reordered" this allows us to make it possible to
talk about an Org implementation that has commands that allow
one to switch the headings without moving their associated
sections.

> 5. (Greater Element / Element)

There are issues here with forms that are part of the syntax vs
forms that are part of the intermediate representation. A line
based parser for Org syntax that assembles greater blocks
after the fact and a parser that uses arbitrary lookahead to
truncate on headings won't have the exact same surface
syntax, however they will both have an equivalent in their
intermediate representation that corresponds to a greater
block. Again, very deep in implementation details here,
but trying to force things like sections into the syntax
hierarchy seems confusing to me.

> 7. Object

Paragraph element maybe? Might seem odd for heading titles
to have paragraph scope, but on the other hand it certainly
simplifies the explanation of the grammar. And you can put
an inline footnote in a heading title.

> 8. Pattern / Form

Don't know what to make of this one. Like "Term" these are
incredibly generic.

> 9. Term

Use of "Term" is super confusing to me.

> We could say call (1) Components, (7) Units, (6) Objects, (5) Element or
> Object (why not spell it out to avoid telling people to remember
> something).

I'm not sure we are ready to specify this. One way that we
might try to manage this would be to create a taxonomy of
element types, e.g. top-level elements, paragraph elements,
etc. This would be consistent with the fact that the elisp
implementation of org-element has all of these as an instance
of element.

> I could have put more thought into this, but it should do for
> illustrating my line of thinking. Let me know if you have any good
> ideas.

Let's leave the terminology as is right now. I'm expecting that there
will be quite a few new terms that we will want to introduce and we
will want to separate syntax and intermediate representation.

With progress on using org-element for fontification and on laundry
we should be able to come up with language that can be used to
distinguish between concepts that are needed for syntax, (tokens,
parser) and for intermediate representations. Things like basic syntax
highlighting need only the langua

Re: Concrete suggestions to improve Org mode third-party integration :: an afterthought following Karl Voit's Orgdown proposal

2021-12-06 Thread Tom Gillespie
Hi all,
I have a much longer mail in the works, a quick one for now.

I think it is a major strategic mistake to exclude discussions
about interoperability from this list. As Bastien pointed out in
his talk at Emacsconf there is only a single list for both users
and developers. Discussion about interoperability with tools for
working with Org are entirely valid subjects for the user
list. Obviously help and support for other tools is not valid for
the list, but questions about interoperability or incorrectness
of some external tool should always be valid.

We must provide strong technical leadership for all tools that
want to work with Org syntax otherwise we risk it spiraling out
of control. Forcing discussions off list will split the community
and I think the fact that Karl's work made it to this list so
late in the process shows the danger of trying to exclude certain
discussions.

I follow this list, I keep the community up to date with my work,
I have no idea where to look for other Org related discussions,
nor frankly do I have time to look for them. I suspect I am not
alone in this.

Whether a certain portion of the Org community likes it or not,
there is another portion for whom Org syntax already has a life
beyond Org mode (e.g. academic papers and computation notebook
style workflows). For some workflows documents written in Org
syntax are a primary exchange format and format of record, not
just an internal format from which documents for sharing are
generated. The plain text nature of Org syntax and the freedom
that it enables also means freedom from Emacs. Empowering users
to own and control their own data to use with their own tools is
the whole point. The fact that this means that it works outside
Emacs is a critical feature for many data preservation use cases.

Enough for now. Best!
Tom



Re: Org-syntax: Intra-word markup

2021-12-04 Thread Tom Gillespie
Hi all,
After a bunch of rambling (see below if interested), I think I have
a solution that should work for everyone. The key realization is that
what we really want is the ability to have a "parse me separately"
type of syntax. This meets the intra-word syntax needs and might
meet some other needs as well.

The solution is to make @@org:...@@ a "parse me separately"
block! It nearly works that way already too! To minimize typing
we could have @@:...@@ the empty type default to org.

This seems like a winner to me. The syntax for it already exists
and won't conflict. It requires relatively minimal additional typing,
the implication is clear, and there are other places where such
behavior could be useful.

This syntax seems like a winner to me
@@org:/hello/@@world
@@:/hello/@@world

You can also do things like
#+begin_src org
I want a number in this number@@org:src_elisp{(+ 1 2)}@@word!
#+end_src

Which would render to
#+begin_src org
I want a number in this number3word!
#+end_src
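
A small Python sketch of the first step of this proposal: cutting a
line into stretches that would be parsed on their own and stretches of
ordinary text. split_parse_separately is hypothetical, and it accepts
both the @@org:...@@ and @@:...@@ spellings purely for illustration.

```python
import re

# Hypothetical "parse me separately" delimiter: @@org:...@@ or @@:...@@
SEPARATE = re.compile(r"@@(?:org)?:(.*?)@@")


def split_parse_separately(text):
    """Split text into ("separate", body) and ("text", body) pieces."""
    parts, position = [], 0
    for match in SEPARATE.finditer(text):
        if match.start() > position:
            parts.append(("text", text[position:match.start()]))
        parts.append(("separate", match.group(1)))
        position = match.end()
    if position < len(text):
        parts.append(("text", text[position:]))
    return parts


print(split_parse_separately("@@:/hello/@@world"))
print(split_parse_separately("number@@org:src_elisp{(+ 1 2)}@@word!"))
```

Each "separate" piece would then be handed to the ordinary object
parser without regard to the characters on either side of it.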

Thoughts?

Best!
Tom

--- rambling below -


> This idea reminds me a bit of Scribble/Racket where every document is
> just inverted code, which makes it possible to insert arbitrary Racket
> code in your prose...

I will say, despite some of my comments elsewhere, that I think
exploring certain features of Scribble syntax for use in Org mode
would simplify certain parts of the syntax immensely.

For example
various inline blocks are an absolute pain to parse because they
allow nested delimiters /if they are matched/. The implementation
of the /if they are matched/ clause is currently a nasty hack which
generates a regular expression that can only actually handle nesting
to depth 3. Actually implementing the recursive grammar adds a lot
of complexity to the syntax and is hard to get right.
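
To illustrate the kind of hack I mean, here is a Python sketch that
generates a regular expression for matched braces up to a fixed
depth. This is an analogy written for this mail, not the elisp that
Org actually uses, and nested_brace_regex is a made-up name.

```python
import re


def nested_brace_regex(depth):
    """Build a regex matching {...} with matched braces nested up to depth levels."""
    # Depth 1: a brace pair with no braces inside it.
    pattern = r"\{[^{}]*\}"
    for _ in range(depth - 1):
        # Each extra level allows the previous pattern inside the pair.
        pattern = r"\{(?:[^{}]|" + pattern + r")*\}"
    return pattern


three_levels = re.compile(nested_brace_regex(3))
print(bool(three_levels.fullmatch("{a{b{c}}}")))
print(bool(three_levels.fullmatch("{a{b{c{d}}}}")))
```

The generated pattern grows with every level and still gives up one
level past whatever depth was chosen.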

It would be vastly simpler to use Scribble's |<{hello }} world}>|
style syntax and always terminate at the first matching delimiter.
I'm sure that this would break some Org files, but it would make
dealing with latex fragments and inline source blocks and inline
footnotes SO much simpler. Matching an arbitrary number of
angle brackets does add some complexity, but it is tiny compared
to the complexity of enforcing matched parens and their failure cases
especially because many of the places where nesting is required
probably only see use of the nesting feature in a tiny fraction of
all cases.
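
A minimal Python sketch of the "terminate at the first matching
delimiter" rule, assuming the opener is a pipe, any number of <, and a
{, and the closer mirrors it. read_delimited is a hypothetical helper
and only handles a delimiter at the very start of the string.

```python
import re


def read_delimited(text):
    """Return the body of a leading |<...{ body }...>| form, or None."""
    opener = re.match(r"\|(<*)\{", text)
    if not opener:
        return None
    # The closer mirrors the opener: same number of angle brackets.
    closer = "}" + ">" * len(opener.group(1)) + "|"
    end = text.find(closer, opener.end())
    if end == -1:
        return None
    return text[opener.end():end]


print(read_delimited("|<{hello }} world}>| and the rest"))
```

No nesting count is kept at all: unmatched braces in the body are fine
as long as the exact closer does not appear in it.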

One other reason why this is attractive is that all the instances
where nested delimiters can appear on a line are preceded by
some non-whitespace character. This means that using the
pipe syntax does not conflict with table syntax!

Now the question comes. If we could implement this for
delimiters, could we also implement something similar
for markup? The issue with the proposed markup outside
delimiter inside approach is that it will change existing
behavior for files that want the delimiters to be included
in the markup, i.e. /{oops}/ becoming /oops/ is bad. A
second issue is that putting the delimiter inside the markup
cannot work for verbatim and code ={oops}= is ={oops}= no
matter what. Therefore the solution is not uniform across all
types of markup. We need another solution that works for
all types of markup.

What if we put the "start arbitrary markup" char outside
the markup? Say something like |/ital/|icks? Or what if
we went whole hog and used |{/ital/}|ics and made the
|{...}| syntax trigger a generalized feature where the
contents of the |{...}| block are parsed by themselves
and can abut any other text? This would be generally
useful in a variety of situations beyond just intra-word
markup.

What are the issues with this approach? The first issue
is that there is a conflict with table syntax if we were to
use the pipe character because markup can appear at
the start of a line. The second issue is that it might be
confusing for users if |{}| also worked like {} when in the
context of latex elements or inline src blocks, or maybe
that is ok because |{}| never renders as text. Hrm. Ok.
Second issue resolved, but what to do about the first?

If we want generalized "parse this by itself" syntax so
that we can write hello|{/world/}|ok, then we need a
solution that can appear at the start of a line. So we
can't use pipe because that is always a table line even
if a zero width space is put before it ;). What other
options do we have? How about #+|{/hello/}|world for
the start of a line? As long as there is no trailing colon
it isn't a keyword, so it could work ... except that if
someone reflows the text and it is no longer at the
start of a line then the syntax breaks. That is to say
using #+| at the start of a line is not uniform, so we
can't take that approach.

What other chars do we have at our disposal? Hrm.
How about @@? Could we use that? What happens
if we use @@org:/hello/@@world? Or maybe if we
want to minimize the number of chars we could do
@@:/hello/@@world and have the empty prefix in
@@ blocks mean org?



Re: Org-syntax: Intra-word markup

2021-12-04 Thread Tom Gillespie
> Since org is a valid export backend though, perhaps this behaviour should be
> reserved for @@:…@@, i.e. no export backend, which I think semantically fits
> fairly nicely.

This ends up being even more convenient than I initially realized.
The current spec for export snippets is ambiguous when it says
"NAME can contain any alpha-numeric character and hyphens"
but the implementation behavior requires that "any" means "at
least one" and is implemented using the + regex operator.
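
A quick Python approximation of that behavior. The pattern below is
my own rendering of "one or more alphanumerics or hyphens, then a
colon", not a copy of the elisp regexp, so treat the details as
assumptions.

```python
import re

# NAME uses "+", so it must contain at least one character.
EXPORT_SNIPPET = re.compile(r"@@([A-Za-z0-9-]+):(.*?)@@", re.DOTALL)


def export_snippets(text):
    """Return (backend, value) pairs for every export snippet found."""
    return [(m.group(1), m.group(2)) for m in EXPORT_SNIPPET.finditer(text)]


print(export_snippets("@@html:<b>@@bold"))
print(export_snippets("@@:/hello/@@world"))
```

The second call returns an empty list, i.e. the empty NAME never
reaches export snippet processing under this reading.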

What this means is that @@:...@@ syntax is not actually used
in Org at all at the moment and renders as plain text. I agree that
we need to avoid @@org:..@@ because it has legitimate uses.
Making a back-end of empty string valid for parse separately
syntax thus makes @@ syntax more regular overall, and allows
@@:...@@ to be processed separately because it currently
never enters the export snippet processing.

This is important because export snippets do not seem to be easily
accessible to earlier phases of the org-export machinery, i.e. there
isn't a nice centralized place to preprocess @@org:...@@ even
if we wanted to. On the other hand @@:...@@ isn't processed
at all. I could be missing something in the org export code though.

It will take a bit of work to get this behavior implemented I think,
but it doesn't seem to have any conflicts. Some users may have
set the empty backend to expand manually via
org-export-snippet-translation-alist, but as long as we give
org-export-snippet-translation-alist priority and warn people
that setting "" manually will disable the new functionality
then there shouldn't be any disruption. The behavior also sort
of matches what we would want the empty string to be in this
case, which is "all backends" and of course the only markup
that makes sense for "all backends" is org itself!

Best,
Tom



Re: Parens matching errors in org-babel code blocks

2021-12-21 Thread Tom Gillespie
Definitely a known issue. No easy way to fix it without someone doing
a deep pass on syntax propertization I think. I have a version of
rainbow delimiters mode that tries to work around this at least for
font locking, but it is severely broken and has some nasty quadratic
performance issues in large files. I'll have to look into the proposed
solution that Tim mentions, I may have missed it (unless it was the
solution for <> that John mentions in the linked thread, in which case
that one is not sufficient). Here is a discussion from back in April.
https://lists.gnu.org/archive/html/emacs-orgmode/2021-04/msg00031.html

Best,
Tom



Re: [PATCH] Accept more :tangle-mode specification forms

2021-11-18 Thread Tom Gillespie
Hi Timothy,
The confusion with 755 and "755" could lead to security issues in
cases like 600 vs "600" vs #o600. The need to protect against the 600
case is fairly important, however I don't think there is anything we
can do about it, because someone might want to enter their modes as
base 10 integers.

If we were to prepend every integer with #o (or set the radix to 8
when reading this particular field) before passing it to
org-babel-parse-header-arguments then it would be impossible to use
base 10 integers unless they were provided in the #10r600 form (Emacs
doesn't support #d600 notation).

I think the best bet is to change the radix for bare integers to 8
when reading that particular header, however I don't know how complex
that would be to implement.
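
For what it is worth, the reading rule itself is small. Here is a
Python sketch of it, assuming three token shapes: #oNNN, the
#RADIXrNNN form, and bare digits that get read with radix 8.
read_mode is hypothetical and says nothing about how this would hook
into org-babel-parse-header-arguments.

```python
import re


def read_mode(token):
    """Read a :tangle-mode token, treating bare integers as octal."""
    if m := re.fullmatch(r"#o([0-7]+)", token):
        return int(m.group(1), 8)
    if m := re.fullmatch(r"#(\d+)r([0-9a-zA-Z]+)", token):
        # Explicit radix, e.g. #10r600 for a base 10 integer.
        return int(m.group(2), int(m.group(1)))
    if re.fullmatch(r"[0-7]+", token):
        return int(token, 8)
    raise ValueError(f"not a recognised mode: {token!r}")


print(oct(read_mode("600")), oct(read_mode("#o600")), read_mode("#10r600"))
```

Under this rule 600 and #o600 agree, and anyone who really wants a
base 10 value has to say so explicitly.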

If we don't want to change the radix to 8 then here are some suggestions.

If #o0600 already parses correctly, then I suggest we leave things as
is. Adding complexity just to drop the leading # seems wasteful.

We may want to warn or raise an error if someone uses a value such as
the base 10 integer 600 which does not map to the usual expected octal
codes so that they don't silently get bad file modes that could leave
files readable to the world.
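
A sketch of such a warning in Python, working on the raw token rather
than the parsed integer (once parsed, a legitimate #o644 is
indistinguishable from the base 10 integer 420). check_mode_token is
a hypothetical name.

```python
def check_mode_token(token):
    """Return a warning for bare base 10 tokens that look octal, else None."""
    if token.isdigit() and all(digit in "01234567" for digit in token):
        actual = int(token)        # how a base 10 reader sees the token
        intended = int(token, 8)   # what the user most likely meant
        return (f"{token} read as base 10 is {actual:#o}, "
                f"did you mean #o{token} ({intended:#o})?")
    return None


print(check_mode_token("600"))
print(check_mode_token("755"))
print(check_mode_token("#o600"))
```

The base 10 integer 755 is 0o1363, which grants write and execute to
others, so the silent failure really can widen access.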

Best,
Tom



Re: "Orgdown", the new name for the syntax of Org-mode

2021-11-28 Thread Tom Gillespie
> I believe (IMHO) that it does not make much sense to separately name the
> Org Mode syntax (as a markup language). That would only generate
> confusion among users.

This is unfortunately not the case. Conflating Org mode which is an Emacs
major mode with Org syntax is a major communication barrier that leads to
confusion for anyone trying to implement a tool based on Org syntax. For
example I couldn't just call my implementation of an org-mode-like package
for Racket "Org mode" because it is not an Emacs major mode. The absence
of a name for Org syntax hampers search and discovery. I'm happy to keep
using the multi-word term Org syntax, but I have found a practical need to
distinguish the surface syntax from the Emacs major mode to reduce
confusion for technical users. Best,
Tom

PS Another brainstormed name: Orgsyn?



Re: "Orgdown", the new name for the syntax of Org-mode

2021-11-28 Thread Tom Gillespie
I had jokingly suggested "orgup" to have a more positive feeling (up
instead of down) than markdown. I'm not sure orgdown will be any more
confusing than some other name. It could imply a version of the org
syntax that uses markdown surface syntax, but it seems that that would
probably be called org flavored markdown by the existing conventions
in the markdown community. Best,
Tom



Re: Some commentary on the Org Syntax document

2021-12-02 Thread Tom Gillespie
Hi Timothy,
Replies in line. Best!
Tom

On Thu, Dec 2, 2021 at 1:32 AM Timothy  wrote:
>
> Hi All (& Nicolas in particular again),
>
> With my recent efforts to write a parser based on
> <https://orgmode.org/worg/dev/org-syntax.html>, I’ve developed a few thoughts
> on that document. Hopefully, they can lead to some improvements and
> clarifications.
>
> 
>
> As a general comment, in many places the Org Syntax document states what
> characters a component can contain, but not what objects/elements. This feels
> like a bit of a hole in the current specifications.

This is indeed confusing because there are some implicit constraints that are
not listed because they never come up. For example, you cannot have two newlines
inside an inline footnote because the two newlines break the paragraph and the
thing that appears to be an inline footnote is just plain text that is never
terminated.

Ensuring that font locking is in sync with org-element and org-export is
critical to ensure that users know what will actually happen.

>
>
> Sections
> 
>
> Heading
> ───
>
> ⁃ Ok, so `TITLE' can have any character but a newline, but what Org 
> components can it contain?
>   I’m going to assume any object?

Via org-element-object-restrictions it is standard-set-no-line-break which is
all objects except citation-reference, table-cell, and line-break.

>
>
> Affiliated Keywords
> ═══
>
>
> Greater Elements
> 
>
> Greater blocks
> ──
>
> ⁃ It is not explained what is meant by a “special block”
> ⁃ Aren’t lines starting with `#+' also quoted by a comma?
>
>
> Drawers and Property Drawers
> 
>
> ⁃ “Contents can contain any element but another drawer”
>   • Does “any element” mean “any Element or Greater Element”

Any element that does not have greater precedence, so that would
be only a heading.

>
> Dynamic Blocks
> ──
>
> ⁃ It is not specified what `CONTENTS' may be

Implicitly follows the same rules as drawers, no headings
and no nesting of dynamic blocks. Text should be added
that states this explicitly.

> ⁃ Surely `PARAMETERS' cannot contain a newline?

Termination by newline is implicit in the example, but the text is confusing.

> Plain Lists and Items
> ─
>
> ⁃ It is not completely clear what content an item may have.
>   I assume any Object?

By my reading it may contain anything, objects and elements,
except for a heading, but that is already implied by the de-indent.

To quote from the docs:

An item ends before the next item, the first line less or equally
indented than its starting line, or two consecutive empty lines.
Indentation of lines within other greater elements do not count,
neither do inlinetasks boundaries.

This makes plain lists one of the most complex elements to parse.
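
Even a stripped-down version of that rule takes some care. The Python
sketch below handles only the indentation test and the two
consecutive empty lines test; it ignores the next-item case as a
separate rule (a sibling item is simply an equally indented line),
nested greater elements, and inlinetasks. item_end is a hypothetical
helper.

```python
def item_end(lines, start):
    """Return the index of the first line that is no longer part of the item."""
    indent = len(lines[start]) - len(lines[start].lstrip(" "))
    blank_run = 0
    for index in range(start + 1, len(lines)):
        line = lines[index]
        if not line.strip():
            blank_run += 1
            if blank_run == 2:
                return index - 1  # the item ends before the two empty lines
            continue
        blank_run = 0
        if len(line) - len(line.lstrip(" ")) <= indent:
            return index  # less or equally indented than the item's first line
    return len(lines)


print(item_end(["- one", "  continued", "- two"], 0))
print(item_end(["- one", "  more", "", "", "  orphan"], 0))
```

A single empty line followed by indented text stays inside the item,
which is exactly the sort of lookahead that makes a line-based parser
awkward here.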

>
> Tables
> ──
>
> ⁃ Surely newlines are not allowed in `FORMULAS'

No newlines are implicit in the use of "lines" but still confusing.

>
> Elements
> 
>
> Clocks
> ──
>
> Two allowed forms are listed, but are all four of the below allowed or only 
> two?
> ┌
> │ CLOCK: INACTIVE-TIMESTAMP
> │ CLOCK: INACTIVE-TIMESTAMP DURATION
> │ CLOCK: INACTIVE-TIMESTAMP-RANGE
> │ CLOCK: INACTIVE-TIMESTAMP-RANGE DURATION
> └

No. Only the two are allowed. An inactive timestamp alone is a
starting point, adding a duration without the end point means
that there is no way to check that the range and duration match.
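
A minimal Python sketch of a check that accepts only those two forms. The
timestamp pattern is simplified (day name and time are treated as optional,
repeaters are not handled), and the names CLOCK_RE and is_valid_clock are
invented for this example rather than taken from Org.

```python
import re

# Simplified inactive timestamp, e.g. [2021-12-02 Thu 10:00].
_TS = r"\[\d{4}-\d{2}-\d{2}(?: [^\]\s\d+>-]+)?(?: \d{1,2}:\d{2})?\]"
# Duration as written by Org after a closed clock range, e.g. "=>  1:30".
_DURATION = r"=>\s+\d+:\d{2}"

# Either a range followed by a duration, or a lone timestamp.
CLOCK_RE = re.compile(
    rf"^\s*CLOCK: (?:{_TS}--{_TS}\s+{_DURATION}|{_TS})\s*$"
)


def is_valid_clock(line):
    """True if LINE is one of the two allowed clock forms."""
    return CLOCK_RE.match(line) is not None
```

With this pattern a lone timestamp followed by a duration, or a range
without a duration, is rejected.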

> All the best,
> Timothy



Re: Org-syntax: Intra-word markup

2021-12-02 Thread Tom Gillespie
I don't mean to be a wet blanket, but the edge cases for
the current markup syntax are already hard enough to
implement correctly, to the point where different parts of
Org mode are inconsistent. Intra-word markup isn't viable
because there simply isn't any sane way to parse something
like *hello world*/hrm/oh no*. The other issue is that this will
degrade parsing performance because almost every
character could precede the start of a markup section.

I recommend anyone suggesting solutions try to implement
something that can parse the markup unambiguously with
lots of nasty test cases. You will likely find that it is impossible
to consistently tokenize markup, and that you have to hand
write a whole bunch of heuristics, making Org syntax even
harder to implement correctly.

Any solution that suggests extending how =/*~+_ can be
used gets a hard no from me. I could see teaching other
exporters how to interpret \emph{hello}world, but trying
to have any sane behavior for something like
why *hello*world oh no a wild asterisk*
is not worth it.
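
To make the ambiguity concrete, here is a small Python sketch that lists
every pair of asterisks that could delimit bold text if markers were allowed
to touch word characters. It only enumerates candidates; the function name
is invented for the example and nothing here reflects how Org actually
tokenizes markup.

```python
from itertools import combinations


def candidate_bold_spans(text):
    """Return (start, end) index pairs of asterisks that could open and
    close a bold span if intra-word markers were legal."""
    stars = [i for i, ch in enumerate(text) if ch == "*"]
    # Require at least one character between the two markers.
    return [(a, b) for a, b in combinations(stars, 2) if b - a > 1]


text = "why *hello*world oh no a wild asterisk*"
spans = candidate_bold_spans(text)
for a, b in spans:
    print(repr(text[a:b + 1]))
```

Three asterisks already give three candidate spans, and a parser would need
an extra rule, beyond the characters themselves, to pick one.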

Best,
Tom



Re: Orgdown: negative feedback & attempt of a root-cause analysis (was: "Orgdown", the new name for the syntax of Org-mode)

2021-11-30 Thread Tom Gillespie
Karl,
   The exact naming of a thing is nearly always the most contentious
step in trying to promulgate it. In my own field we can easily get all
parties to agree on a definition, but they refuse to budge on a name.
As others have said, I wouldn't worry about kibitzing over the name.

I would however worry about the larger negative reaction. From my
perspective I think the issue is that there are many efforts working
toward a formalized specification for Org syntax and Org mode
functionality, and some of those stakeholders who have invested
significant effort may feel blindsided by a public declaration
announcing Orgdown because they were not consulted and not
made aware that you were working on it.

I appreciate the amount of work that you have put in. I have devoted
hundreds of hours to working on an alternate implementation of Org
in Racket that uses a formal eBNF in hopes that others will be able
to use it as a guide and as a way to talk formally about how Org
parsers and implementations should behave.

It would thus be easy for me to say that your approach has put the
cart before the horse, because there are countless nuances in the
specification for Org syntax which must be addressed before any
levels of org compliance can be specified, otherwise the behavior
between levels will be inconsistent.

If I were to say this, it would not be fair to you at all. The ideas
and motivation for Orgdown are vital and important. You have put
in enormous thought and effort, all because you care about Org
and want to see it succeed.

The issue is that any shared specification for Org syntax is
fundamentally about how to coordinate as a community.
The way that Orgdown was presented to the community feels
(to me) like it is being imposed top down or coming from an
individual source, not from an open and visible community
process (the subject of your original email reads as a declaration
in English, and thus can be quite off-putting, though I know that
was not the intention).

I personally haven't bothered with promulgation because I think
that we are not technically ready as a community to approach
outreach to other developers in a way that we can succeed.

The good news is that all of this can co-exist if we want it to,
but we need to be clear about our objectives as a community.

To me these objectives are as follows (and I would love
to hear from others about additional or alternate objectives).

1. To never fracture Org syntax so as to avoid the nightmare
of markdown flavors. (This means being able to say clearly
as a community that a parser is out of compliance and that
it is up to the user to fix their files. The ruby org parser used
by Github is a major issue here.)
2. To provide a clear specification for what graceful degradation
looks like when parsing Org syntax if a parser does not support
some portion of that syntax (e.g. should property drawer lines
be excluded or rendered as plain text?).
3. Provide a solid basis on which further formal specification
can be built. (My interests in particular are around providing
consistent semantics for org-babel blocks across languages
so that babel implementations can clearly communicate what
runtime features they support.)

The approach for Orgdown can absolutely meet all three of
these objectives, however in its current form Orgdown1 is not
sufficiently well specified to avoid fracturing the syntax.
This is because Org syntax is extremely complex (even the
elisp implementation of Org mode is internally inconsistent)
and there are edge cases where behavior will diverge if parsing
of even the simplest elements is not fully specified.

There are many ways to remedy this, however they require
a more formal approach. A number of us are working to build
technical foundations for such a formal approach, but I do not
think that any of those projects are ready to be used to
specify discrete levels of Org syntax parsing compliance.

If I may, I would suggest that an Orgdown0 is something that
could be well specified, but it would avoid parsing of markup
altogether and only deal with the major element types. Parsing
paragraphs and all the org objects is not something that can
be done piecemeal. There are too many interactions between
different parts of the syntax, and in some cases the existing
specification desperately needs to be revisited due to the
complexity that it induces or because it is underspecified.
Of course this would make Orgdown0 fairly useless as a
replacement for markdown, but at least it would be a start.

Best,
Tom



Re: noweb and shell heredocs

2021-11-30 Thread Tom Gillespie
Hi Łukasz,
One workaround that is fairly reliable is to prefix the names
of the blocks to be nowebbed with an &. So #+name: block-name
becomes #+name: &block-name. Then you reference it as
<<&block-name>> and the heredoc syntax is broken. Best,
Tom



Re: Formal syntax for org-cite

2021-11-30 Thread Tom Gillespie
Hi Timothy,
Thanks for putting this together. Comments in line. Best!
Tom

For reference here is the tokenizer pattern I use in laundry at the moment.
There are a number of issues with it ...
https://github.com/tgbugs/laundry/blob/5a396bef98d9a3cd9ee929f21cd47612dd6cb1ac/laundry/lex-abbrev.rkt#L896-L913

> Citation syntax is currently not documented, but from the implementation
> it looks something like this:
> #+begin_example
> [cite CITESTYLE: GLOBALPREFIX KEYCITES GLOBALSUFFIX]
> #+end_example

There is potential confusion here because =[cite= does not have to be
followed by a space (rather, cannot be).

The top level syntax is =[cite= terminating at the first occurrence of =]=.
I think we may also need to include a note that no whitespace is allowed either?
It will only be recognized within paragraph context (e.g. headings, paragraphs,
and other places where org objects can appear). Stating that up front would
clarify that the rest of the syntax described here is how to determine whether
the citation is well formed/how to parse it.

> =KEY= can be made of any word-constituent character, =-=, =.=, =:=, =?=,
> =!=, =`=, ='=, =/=, =*=, =@=, =+=, =|=, =(=, =)=, ={=, =}=, =<=, =>=,
> =&=, =_=, =^=, =$=, =#=, =%=, =%=, or =~=.

You have a duplicated =%= here.
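
For reference, the quoted character list (with the duplicate dropped) can be
written as a single character class. This is a Python sketch under the
assumption that \w is an acceptable stand-in for "word-constituent", which
in Emacs depends on the syntax table; KEY_RE and is_valid_key are names made
up for this example.

```python
import re

# Punctuation allowed in KEY per the quoted list, "%" listed once.
_KEY_PUNCT = "-.:?!`'/*@+|(){}<>&_^$#%~"

# \w approximates Emacs "word-constituent" characters (an assumption).
KEY_RE = re.compile(r"[\w" + re.escape(_KEY_PUNCT) + r"]+")


def is_valid_key(key):
    """True if KEY consists only of characters from the quoted list."""
    return KEY_RE.fullmatch(key) is not None
```

Note that whitespace, =;=, =,=, =[= and =]= are all rejected by this class,
which is what lets a key terminate unambiguously.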

> I have not yet confirmed what =KEYPREFIX= and =KEYSUFFIX= may contain,
> but as a starting point, any of the characters allowed in =KEY= except
> =@= plus whitespace would seem fairly safe. =KEYSUFFIX= must start with
> a whitespace character to be able to be differentiated from =KEY=.

I don't think we can allow whitespace here?

> =CITESTYLE= consists of a main =STYLE= and any number of =VARIANT=s
> (including zero), prefixed by forwards slashes in the following pattern
> #+begin_example
> /STYLE/VARIANT/VARIANT/VARIANT
> #+end_example

Need clarification on empty styles e.g. [cite//:]
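
A small Python sketch of splitting =CITESTYLE= as described in the quoted
text. It does not decide whether empty components are valid; it keeps them
as empty strings so the open question stays visible. The function name is
hypothetical and this is not the org-cite implementation.

```python
def split_citestyle(citestyle):
    """Split '/STYLE/VARIANT/...' into (style, variants).

    An empty CITESTYLE gives (None, []). Empty components, as in
    '[cite//:]', are preserved as '' for the caller to accept or reject.
    """
    if citestyle == "":
        return None, []
    if not citestyle.startswith("/"):
        raise ValueError("CITESTYLE must start with '/'")
    style, *variants = citestyle[1:].split("/")
    return style, variants
```

For [cite//:] the CITESTYLE part is "//", which this sketch reports as an
empty style with one empty variant.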

> "cite" and =CITESTYLE=, =KEYCITES= and =GLOBALSUFFIX= are /not/
> separated by whitespace. Neither are =KEYPREFIX=, =@KEY=, or =KEYSUFFIX=
> separated by whitespace.

I may be missing something, but this is confusing with respect to the
statement about =KEYSUFFIX= and whitespace made above.



Re: [PATCH] ob-core: tangle check library of babel after current buffer

2021-07-17 Thread Tom Gillespie
Pinging on this to see if anyone can test it so that it can be merged.
Tom

On Wed, Jun 16, 2021 at 4:29 PM Tom Gillespie wrote:
>
> Hi,
>This is a patch that fixes tangling behavior when a block has been
> ingested into the library of babel and then modified. Best!
> Tom



Re: Headings and Headlines

2021-07-23 Thread Tom Gillespie
I enthusiastically support changing the documentation to use heading.
I use heading in my formal grammar because I have found there are more
ways that it can be modified and remain grammatically correct when
used in English sentences. The internal implementation in elisp still
refers to headlines, but changing the docs would be a good first step.
Best!
Tom



RE: Timestamp parsing inside node properties and other contexts out of org-element-object-restrictions (was: [BUG] Agenda no longer works for timestamps inside properties drawer [9.5.2 (release_9.5.2-

2022-03-22 Thread Tom Davey
Hi Tim, 

Thanks for these thoughtful comments. I agree that the Org developers (to
whom I, as a mere user, owe enormous thanks) must be wary before making
changes to how timestamps are handled. 

This argues, I would say, for keeping what I believe was the status quo
since at least Org version 9.4.4: Agenda views would display entries with
active timestamps in property drawers. That has been my historical
experience. 

Tim, has your historical experience been different? In the invoicing example
you give, were the timestamps in the properties drawer active, or inactive? 

I have just verified with a simple test that Org version 9.4.4, which was
shipped with Emacs 27.2 I believe, does display entries with an active
timestamp as the value of a property in the ordinary :PROPERTIES: drawer.
That's the situation I'm calling the "status quo." I'm wondering if my
experience coincides with the experience of others. 

Here's the simple entry that will be shown on the Week/Day Agenda view in
9.4.4: 

* TODO Test of active timestamps
   :PROPERTIES:
   :Created:  <2022-03-22 Tue 18:30>
   :END:

And note this: adding a second active timestamp to the test entry, e.g., to
accompany a SCHEDULED: keyword, results in the entry appearing on the Agenda
twice, as would be expected: 

* TODO Test of active timestamps
   SCHEDULED: <2022-03-22 Tue 18:30>
   :PROPERTIES:
   :Created:  <2022-03-22 Tue 18:30>
   :END:

This second example shows why the variable
org-agenda-skip-additional-timestamps-same-entry is valuable. I rarely want
an entry to display twice on the same day. 
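
The double listing can be illustrated with a short Python sketch that simply
collects active timestamps from an entry, optionally skipping the property
drawer. This only mimics the observable behaviour described above; it is
not how org-agenda works internally, and the function name and the
include_properties flag are invented for the example.

```python
import re

# Active timestamp: angle brackets with a leading ISO date.
ACTIVE_TS_RE = re.compile(r"<(\d{4}-\d{2}-\d{2})[^>\n]*>")


def active_dates(entry, include_properties=True):
    """Return the date of every active timestamp in ENTRY, in order."""
    dates = []
    in_drawer = False
    for line in entry.splitlines():
        stripped = line.strip()
        if stripped == ":PROPERTIES:":
            in_drawer = True
            continue
        if stripped == ":END:":
            in_drawer = False
            continue
        if in_drawer and not include_properties:
            continue
        dates.extend(ACTIVE_TS_RE.findall(line))
    return dates


entry = """* TODO Test of active timestamps
   SCHEDULED: <2022-03-22 Tue 18:30>
   :PROPERTIES:
   :Created:  <2022-03-22 Tue 18:30>
   :END:"""
```

Here active_dates(entry) yields the same date twice, and deduplicating per
day (or passing include_properties=False) leaves a single hit, which is
roughly the effect of skipping additional timestamps in the same entry.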

Tom Davey 

--
Tom Davey
t...@tomdavey.com
New York NY USA

-Original Message-
From: Emacs-orgmode  On
Behalf Of Tim Cross
Sent: Tuesday, March 22, 2022 5:10 PM
To: Ihor Radchenko 
Cc: Ignacio Casso ; emacs-orgmode@gnu.org;
t...@tomdavey.com; Nicolas Goaziou 
Subject: Re: Timestamp parsing inside node properties and other contexts out
of org-element-object-restrictions (was: [BUG] Agenda no longer works for
timestamps inside properties drawer [9.5.2 (release_9.5.2-24-g668205 @
/home/ignacio/repos/emacs/lisp/org/)])


Ihor Radchenko  writes:

> Ihor Radchenko  writes:
>
>> After further reading the source code, I figured that agenda is, in 
>> fact, supposed to handle timestamps inside property drawers. Optional 
>> arguments for org-at-timestamp-p imply that, in agenda specifically, 
>> timestamps inside node properties are considered timestamps despite 
>> not being parsed as timestamps by org-element.
>
> Even though I fixed the reported issue with agenda not showing 
> headings with matching timestamps inside property drawers, this 
> situation is revealing a big inconsistency in Org mode's handling of
> timestamps.
>
> org-at-timestamp-p usage implies that Org syntax for timestamps is not 
> only context-dependent, but also depends on current command!
>
> org-at-timestamp-p is called with non-nil argument in a number of 
> functions in Org:
> - org-clock-timestamps-change
> - org-mouse-delete-timestamp
> - org-mouse-context-menu
> - org-follow-timestamp-link
> - org-get-repeat
> - org-auto-repeat-maybe
> - org-time-stamp
> - ... many more in org.el
>
> So, depending on the current command, Org may or may not treat objects 
> matching org-ts-regexp-both as timestamps.
>
> This situation complicates syntax and makes org-element unreliable 
> when dealing with Org buffers.
>
> Should we just simply allow timestamps to be a part of node property 
> values? Should we _not_ treat timestamp-looking text outside their 
> allowed contexts (like quotes, source blocks, etc) as timestamps?
>

I think we have to be very wary here. I can see any changes here causing
lots of breakage for people. I know for my own use case, I use timestamps a
lot in property drawers and various source blocks. I never want any of them
showing up in my agenda.

As an example, I was recently working for a company which required that you
put a timestamp in both a file header and in comments. The format they used
was pretty much the same as an org-mode active timestamp. I use org mode to
tangle the source files I write (as well as manage my client data, such as
todos, invoicing, contacts etc), so these files are searched for agenda
items, but I do not want any of those timestamps causing lines in my agenda
views. These timestamps are most commonly found in source and example
blocks.

I think the only time an org timestamp should be recognised in a source
block is when that source block is an org-mode source block. I don't think
they should ever be 'recognised' in example blocks.  

In addition, my invoicing solution, which is based on org, uses timestamps
to track invoice periods etc. None of these should ever appear in the
agenda. This information is typically tracked in property drawers.

Unfortunately, I think org times

RE: [BUG] Agenda no longer works for timestamps inside properties drawer [9.5.2 (release_9.5.2-24-g668205 @ /home/ignacio/repos/emacs/lisp/org/)]

2022-03-21 Thread Tom Davey
Ihor writes: 

> I personally see allowing timestamps (and links) inside property values as a 
> useful feature. 
> Would it be of interest for other users?

Yes, it's a quite useful feature. For years, via my Capture templates, I've 
been adding a property named :Created: to the properties drawer as follows: 

:PROPERTIES:
:Created:  <2022-03-06 Sun 22:42>
:END:

Now, in 9.5.2, literally hundreds of entries that formerly appeared on the 
built-in Agenda views cannot be easily found. 

Regards to all, 
Tom 

PS The variable 'org-agenda-skip-additional-timestamps-same-entry seemed 
expressly made for my use case, to clean up same-day clutter in entries with a 
TODO timestamp.

--
Tom Davey
t...@tomdavey.com
New York NY USA

-Original Message-
From: Emacs-orgmode  On Behalf 
Of Ihor Radchenko
Sent: Saturday, March 12, 2022 7:29 AM
To: Ignacio Casso 
Cc: emacs-orgmode@gnu.org
Subject: Re: [BUG] Agenda no longer works for timestamps inside properties 
drawer [9.5.2 (release_9.5.2-24-g668205 @ /home/ignacio/repos/emacs/lisp/org/)]

Ignacio Casso  writes:

> In Emacs 27.2, with an up to date version of org from ELPA (9.5.2), 
> org-agenda considers timestamps that appear in property drawers, so 
> the entry below appears in the daily agenda view.
>
> * Heading
>   :PROPERTIES:
>   :timestamp: <2022-03-12 sáb>
>   :END:
>
> However, in the latest Emacs version built from source (29.0.50), with 
> the built-in version of org (also 9.5.2, but the latest release, I 
> assume), this is no longer the case and that entry does not appear in 
> the agenda view.
>
> I know that maybe it's unorthodox, but I have some org files that rely 
> on the previous behavior, with entries like the following:
>
> * Some friend
>   :PROPERTIES:
>   :birth-date: <1994-03-12 sáb +1y>
>   :END:
>
> Is this a bug? If it's not, can someone point me to the functions I 
> need to touch to restore the previous behavior? Or maybe I should stop 
> doing this and start moving those timestamps out of the properties 
> drawer in my files?

What you see in the new Org version is not a bug. Property values are treated 
as plain text by Org.

In the older versions, agenda code did not rely on Org's internal parsing and 
matched timestamps in places where timestamps are not allowed (inside code 
blocks, for example). See 
https://orgmode.org/list/20220101122409.ga29...@itccanarias.org

Dear all,

I was unable to find a place in the manual describing that timestamps cannot be 
placed inside property values:

>> A timestamp is a specification of a date (possibly with a time or a 
>> range of times) in a special format, either ‘<2003-09-16 Tue>’ or
>> ‘<2003-09-16 Tue 09:39>’ or ‘<2003-09-16 Tue 12:00-12:30>’(1).  A 
>> timestamp can appear anywhere in the headline or body of an Org tree 
>> entry.  Its presence causes entries to be shown on specific dates in 
>> the agenda (see *note Weekly/daily agenda::).  We distinguish:

I personally see allowing timestamps (and links) inside property values as a 
useful feature. Would it be of interest for other users?

In any case, we should probably clarify the manual in this regard.

Best,
Ihor





RE: [BUG] Agenda no longer works for timestamps inside properties drawer [9.5.2 (release_9.5.2-24-g668205 @ /home/ignacio/repos/emacs/lisp/org/)]

2022-03-21 Thread Tom Davey
Ignacio writes: 

> I've located the line in org-agenda.el responsible for the new behavior,
> and the following patch seems to fix it. I suggest it is incorporated
> into the repository, maybe with a variable org-agenda-skip-timestamps-
> in-properties-drawer defaulting to t if not everyone agrees.

I second that suggestion for the repository! Thanks very much for the patch.

I think you are correct in supposing that when Emacs 28.1 is released, many
Org users who upgrade will be mystified at the new timestamp behavior and
will spend time without success trying to figure out what changed.

Perhaps the new variable you propose,
org-agenda-skip-timestamps-in-properties-drawer, should default to nil to
preserve the historical behavior? 

--
Tom Davey
t...@tomdavey.com
New York NY USA

-Original Message-
From: Emacs-orgmode  On
Behalf Of Ignacio Casso
Sent: Monday, March 21, 2022 7:21 PM
To: t...@tomdavey.com
Cc: emacs-orgmode@gnu.org
Subject: Re: [BUG] Agenda no longer works for timestamps inside properties
drawer [9.5.2 (release_9.5.2-24-g668205 @
/home/ignacio/repos/emacs/lisp/org/)]


>> What you see in the new Org version is not a bug. Property values are 
>> treated as plain text by Org.
>>
>> I was unable to find a place in manual describing that timestamps 
>> cannot be placed inside property values:

>> I personally see allowing timestamps (and links) inside property values
>> as a useful feature.
>> Would it be of interest for other users?
>
> Yes, it's a quite useful feature. For years, via my Capture templates,
> I've been adding a property named :Created: to the properties drawer as
> follows:
>
> :PROPERTIES:
> :Created:  <2022-03-06 Sun 22:42>
> :END:
>
> Now, in 9.5.2, literally hundreds of entries that formerly appeared on the
> built-in Agenda views cannot be easily found.


It seems that I'm not the only one using this unintended feature in previous
versions of org, and probably there will be many others who don't use the
latest version of org and have not noticed yet but will have the same
problem when they upgrade.

I think that even if timestamps were never intended to be used inside
property drawers before, the fact that it worked for a long time and nothing
in the documentation suggested otherwise makes it a de facto feature, even
if unintended, and should be preserved.

I've located the line in org-agenda.el responsible for the new behavior, and
the following patch seems to fix it. I suggest it is incorporated into the
repository, maybe with a variable
org-agenda-skip-timestamps-in-properties-drawer defaulting to t if not
everyone agrees.





Re: [BUG] Make SVG + LaTeX work by default [9.5.2 (release_9.5.2-9-g7ba24c @ /Users/salutis/src/emacs/nextstep/Emacs.app/Contents/Resources/lisp/org/)]

2022-01-30 Thread Tom Gillespie
I do not think we can add -shell-escape by default because it
is an arbitrary code execution vector. It might be good to add
a setting in org that would do the right thing without requiring
a user to understand the arcana of latex cli options though.
Best,
Tom



Re: Suggestion: convert dispatchers to use transient

2022-02-03 Thread Tom Gillespie
The backward compatibility requirements for org mean that it won't be
possible to replace the existing implementation for quite a while. That
said, I imagine that having optional transient dispatchers for users on
newer versions of emacs would be appreciated.
Best,
Tom



Re: Org Syntax Specification

2022-01-18 Thread Tom Gillespie
Hi Ihor,
  Thank you very much for the detailed responses. Let me start with
some context.

1. A number of the comments that I made fall into the brainstorming
   category, so they don't need to make their way into the document at
   this time. I agree that it is critical for this document to capture
   how org is parsed right now and that we should not put the
   pie-in-the-sky changes in until the behavior of org-element matches
   (if such a change is made at all).
2. Though I haven't been hacking on it, I fully intend to contribute
   test cases and exploratory work on org-element in the future, so
   please don't interpret some of what I am writing as requests for
   other people to write code (unless they want to :)
3. When I say grammar in this context I mean specifically an eBNF that
   generates a LALR(1) or LR(1) parser. This is narrower than the
   definition used in the document, which includes things that have to
   be implemented in the tokenizer, or in a pass after the grammar has
   been applied, or are related to some other aspect beyond the pure
   surface syntax.
4. A number of my comments are about the structure of the document
   more than the structure of the syntax or the implementation. I
   think that most of them are trying to ask whether we want to
   clearly delineate pure surface syntax from semantics to make the
   document easier to understand.

More replies in line.
Best!
Tom

> As for your other comments, you seem to be suggesting a number of
> changes to the existing Org syntax. Some of them look fine, some do
> not. However, please keep in mind that we have to deal with back
> compatibility, third party compatibility, and not breaking existing Org
> documents unless we have a very strong justification. I suggest to
> branch a number of new threads from here for each concrete suggestion
> where you want to make changes to Org syntax, as opposed to just
> document wording. Otherwise, this discussion will become a total mess.

Agreed. I put many of these in here as notes from my experiences, I
will branch those off into separate discussions so that we don't
pollute this thread.

> Nope. Sections are actually elements. See =org-element-all-elements=.

I realized this at a slightly later date but missed cleaning up this
comment.  See my response on section vs segment below.

> I disagree. Nesting rules are the important part of syntax. We have
> restrictions on what elements can be inside other element. The same
> patterns are not recognised in Org depending on their nesting. For
> example, links that you put into property drawers are not considered
> link objects.

When I wrote this comment I was still confused about sections. I think
discussion of nesting in most contexts is ok, but there are some cases
where nesting cannot be determined from the grammar, and there I think
we need to make a distinction.

In my thinking I separate the context sensitive nature of parsing from
the nesting structure of the resulting sexpressions, org elements,
etc. The most obvious example of this is that the sexpression
representation for headings nests based on the level of the heading,
but heading level cannot be determined by the grammar so it must be
reconstructed from a flat sequence of headings that have varying level.
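
A minimal Python sketch of that reconstruction step, assuming the grammar
has already produced a flat list of (level, title) pairs. The function name
and the dict layout are invented for this example; laundry and org-element
each do this in their own way.

```python
def nest_headings(flat):
    """Turn [(level, title), ...] into a nested list of dicts.

    Each heading becomes a child of the nearest preceding heading with a
    strictly smaller level; level-skipping (e.g. 1 then 3) is allowed.
    """
    root = {"level": 0, "title": None, "children": []}
    stack = [root]
    for level, title in flat:
        node = {"level": level, "title": title, "children": []}
        # Pop until the top of the stack is a strictly shallower heading.
        # The root has level 0, so it is never popped for levels >= 1.
        while stack[-1]["level"] >= level:
            stack.pop()
        stack[-1]["children"].append(node)
        stack.append(node)
    return root["children"]
```

The point is that the nesting comes from a pass over the token stream with
a stack, not from the grammar rules themselves.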

> Again I disagree. While your idea about table cells is reasonable
> (similar for citation-references inside citations), I am against
> decoupling Org syntax from org-element implementation. In
> org-element.el, table-cells are just yet another object. If we make
> things in org-element and syntax document out of sync, confusion and
> errors will follow during future maintenance.

Org element treats all elements and objects as a single homogeneous
type.  This is fine. However, to help people understand the syntax it
seems easier to define things in a positive way so that we don't say
"all except these two."  Therefore, despite the fact that the
implementation of org-element treats table rows and cells no differently
from any other node in the parse tree, we don't need to burden the
reader with that information at this point in time, and could provide
that information as an implementation note for cells.  I think the
other issue I was having here is that the spec for tables is spread
all over the place, and it would be much easier to understand and
implement if it were all in one place.

> This actually reads slightly confusing. "Blank lines separate paragraphs
> and other elements" sounds like blank lines are only relevant
> before/after paragraphs. However, there are also footnote references and
> lists. Maybe we can try something like:
>
> Blank lines can be used to indicate end of some elements.
>
> "can" because a single blank line usually does not separate anything.

I think your version is quite a bit more readable.  Can we list the
set of all the elements t

Re: Problem when tangling source blocks with custom coderefs

2022-01-18 Thread Tom Gillespie
Hi Luis,
   I don't think you are doing anything wrong. IIRC the portion of the
patch that allowed the customization to propagate to the tangled code
was not included. Given that I am no longer the only one who is
looking for/expecting this behavior, maybe it is worth revisiting the
decision. The simplest fix right now would be to prepend your coderef
with the python comment symbols # |hello| so that at the very least it
won't break your tangled files. I would like to see this implemented,
so let's see what Nicolas has to say. Best!
Tom



Re: Org Syntax Specification

2022-01-17 Thread Tom Gillespie
Hi Timothy,
I have attached a patch with some modifications and a bunch of
comments (as footnotes). More replies in line. Thank you for all your
work on this!
Tom

> Marking this as deprecated would have no effect on Org’s current behaviour, 
> but we could:
>
> Mark as deprecated now-ish
> Add a utility to convert from TeX-style to LaTeX-style
> Add org lint/fortification warnings
> A while later (half a decade? more?) actually remove support

In favor of this. There are good alternatives for this now.

> The other component of the syntax which feels particularly awkward to me is 
> source block switches. They seem a bit odd, and since arguments exist, 
> completely redundant.

Extremely in favor of removing switches. There are so many better ways
to do this now that aren't like some eldritch unix horror crawling up
out of the abyss and into the eBNF :)
From 3527331f02e593ec6ba6cb4c8bde3f64de3ad216 Mon Sep 17 00:00:00 2001
From: Tom Gillespie 
Date: Mon, 17 Jan 2022 19:34:21 -0500
Subject: [PATCH] Tom's comments and modifications to org syntax edited

I removed any mention of markdown because it is a distraction in this
document and is not something we want anyone attending to here.

I changed "top level section" to "zeroth section" which I think is more
consistent terminology because level is often used to refer to the
depth of parsing at any given point in the file and the top level
refers to anything that can be parsed without context. Zeroth makes it
clear that we are talking about the actual zeroth occurrence of a
section in a file/buffer/stream.
---
 dev/org-syntax-edited.org | 399 +++---
 1 file changed, 331 insertions(+), 68 deletions(-)

diff --git a/dev/org-syntax-edited.org b/dev/org-syntax-edited.org
index c3259473..2e99070d 100644
--- a/dev/org-syntax-edited.org
+++ b/dev/org-syntax-edited.org
@@ -19,9 +19,7 @@ under the GNU General Public License v3 or later.
 Org is a plaintext format composed of simple, yet versatile, forms
 which represent formatting and structural information.  It is designed
 to be both intuitive to use, and capable of representing complex
-documents.  Like [[https://datatracker.ietf.org/doc/html/rfc7763][Markdown]], Org may be considered a lightweight markup
-language.  However, while Markdown refers to a collection of similar
-syntaxes, Org is a single syntax.
+documents.
 
 This document describes and comments on Org syntax as it is currently
 read by its parser (=org-element.el=) and, therefore, by the export
@@ -32,14 +30,13 @@ framework.
 ** Objects and Elements
 
 The components of this syntax can be divided into two classes:
-"[[#Objects][objects]]" and "[[#Elements][elements]]".  To better understand these classes,
-consider the paragraph as a unit of measurement.  /Elements/ are
-syntactic components that exist at the same or greater scope than a
-paragraph, i.e. which could not be contained by a paragraph.
-Conversely, /objects/ are syntactic components that exist with a smaller
-scope than a paragraph, and so can be contained within a paragraph.
-
-Elements can be stratified into "[[#Headings][headings]]", "[[#Sections][sections]]", "[[#Greater_Elements][greater
+"[[#Elements][elements]]" and "[[#Objects][objects]]".  Elements are
+syntactic components that have the same priority as or greater
+priority than a paragraph. Objects are syntactic components that are
+only recognized inside a paragraph or other paragraph-like elements
+such as heading titles.
+
+Elements are further divided into "[[#Headings][headings]]", "[[#Sections][sections]]"[fn::sections are not elements], "[[#Greater_Elements][greater
 elements]]", and "[[#Lesser_Elements][lesser elements]]", from broadest scope to
 narrowest.  Along with objects, these sub-classes define categories of
 syntactic environments.  Only [[#Headings][headings]], [[#Sections][sections]], [[#Property_Drawers][property drawers]], and
@@ -52,7 +49,12 @@ elements that cannot contain any other elements.  As such, a paragraph
 is considered a lesser element.  Greater elements can themselves
 contain greater elements or lesser elements. Sections contain both
 greater and lesser elements, and headings can contain a section and
-other headings.
+other headings. [fn:tom2:I would not discuss strata here because it is
+not related to the syntax of the document. It is related to how that
+syntax is interpreted by org mode. The strata are nesting rules that
+are independent of the syntax, and discussing that here in the syntax
+document is confusing, because the nesting is not something that can be
+parsed directly because it depends on the number of asterisks.]
 
 ** The minimal and standard sets of objects
 
@@ -60,25 +62,33 @@ To simplify references to common collections of objects, we define two
 useful sets.  The

Re: call blocks as a function from inside elisp code

2022-01-19 Thread Tom Gillespie
Hi George,
Here is an example of how I call nested elisp and python. The
python block is an input argument to the elisp block in this case, but
the python block could be called directly as well. I'm not sure how to
pass arguments to the block from inside elisp via org-babel-eval
though, that seems like it would require some deeper
tampering/advising of functions. Best,
Tom

https://github.com/SciCrunch/sparc-curation/blame/master/docs/queries.org#L1704-L1707
#+begin_src elisp :results none :exports none
(ow-babel-eval "neru-simplified")
#+end_src

The implementation I use is included below and is sourced from
https://github.com/tgbugs/orgstrap/blob/bc981b957967be8d872c08be9ba7f2dbde5caf1d/ow.el#L786-L803

(defun ow-babel-eval (block-name &optional universal-argument)
  "Use to confirm running a chain of dependent blocks starting with BLOCK-NAME.
This retains single confirmation at the entry point for the block."
  ;; TODO consider a header arg for a variant of this in org babel proper
  (interactive "P")
  (let ((org-confirm-babel-evaluate (lambda (_l _b) nil))) ;; FIXME TODO set messages buffer size to nil
    (save-excursion
      (when (org-babel-find-named-block block-name)
        ;; goto won't raise an error which results in the block where
        ;; `ow-confirm-once' is being used being called an infinite
        ;; number of times and blowing the stack
        (org-babel-goto-named-src-block block-name)
        (unwind-protect
            (progn
              ;; FIXME optionally raise errors on failure here !?
              (advice-add #'org-babel-insert-result :around
                          #'ow--results-silent)
              (org-babel-execute-src-block))
          (advice-remove #'org-babel-insert-result #'ow--results-silent))))))

(defun ow--results-silent (fun &rest args)
  "Whoever named the original version of this has a strange sense of humor."
  ;; so :results silent, which is what org babel calls between vars
  ;; set automatically is completely broken when one block calls another
  ;; there likely needs to be an internal gensymed value that babel blocks
  ;; can pass to each other so that a malicious user cannot actually silence
  ;; values, along with an option to still print, but until then we have this
  (let ((result (car args))
        (result-params (cadr args)))
    (if (member "silent" result-params)
        result
      (apply fun args))))



Re: [PATCH] Add support for $…$ latex fragments followed by a dash

2022-01-26 Thread Tom Gillespie
> The change is local and minor.
We can't know that. Consider for example someone that has
the following line somewhere in their files.
#+begin_src org
I spent $20 on food and was paid$-10 dollars by friends so
I am down $10.
#+end_src
Yes =paid$-10= is probably a typo that should have a space
in between, but it could still be in a file and cause an issue.
The more likely case would be of someone that has $ in the
name of a variable that also uses dashes. For example if
I have a list of variable names such as
#+begin_src org
Text a $A_BASH_VAR
Text b some-$-lisp-var
#+end_src

The proposed change would break any file with a pattern like
this.
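
To make the risk concrete, here is a rough Python sketch. It is not Org's actual regexp: it assumes a simplified `$…$` matcher in which the characters allowed after the closing `$` are a parameter. The function name and both character sets are made up for illustration.

```python
import re

def dollar_fragments(text, post_chars):
    """Return the bodies of $...$ fragments under a simplified rule.

    Assumption: the closing $ must be followed by whitespace, the end
    of the line, or one of POST_CHARS. This is not Org's real regexp.
    """
    pattern = re.compile(
        r"(?<!\$)\$"                        # opening $, not part of $$
        r"([^\s,;.$](?:[^$\n]*[^\s,.$])?)"  # body: no $, no whitespace at the edges
        r"\$"                               # closing $
        r"(?=[\s" + re.escape(post_chars) + r"]|$)",
        re.MULTILINE,
    )
    return [m.group(1) for m in pattern.finditer(text)]

CURRENT_POST = ".,;:!?'\")"         # hypothetical stand-in for the current rule
PROPOSED_POST = CURRENT_POST + "-"  # the same set with the dash added

text = ("I spent $20 on food and was paid$-10 dollars by friends so\n"
        "I am down $10.")
print(dollar_fragments(text, CURRENT_POST))
print(dollar_fragments(text, PROPOSED_POST))
```

Under these assumptions the first call finds nothing, while the second picks up "20 on food and was paid" as a fragment, which is the kind of silent change in existing files I am worried about.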

We have no way of seeing every org file that users have
written so we don't know the extent of the impact, and thus
have to assume that there would be some impact. Making
such a change with an unknown blast radius in the midst of
considering removing support for that syntax altogether is
inviting disaster.

Best,
Tom



Re: [PATCH] Add support for $…$ latex fragments followed by a dash

2022-01-25 Thread Tom Gillespie
> The attached patch adds support for $…$ latex fragments followed by a
> dash, such as $n$-th.

Unfortunately this falls into the realm of changes to syntax. The current
behavior is not a bug and is working as specified because hyphen minus
(U+002D) does not count as punctuation for the purposes of org syntax.
We should specify which chars count as punctuation in the syntax doc.
As noted by Eric \(\) has no such restrictions.

From https://orgmode.org/worg/dev/org-syntax.html#Entities_and_LaTeX_Fragments
> POST is any punctuation (including parentheses and quotes) or space 
> character, or the end of line.

Best,
Tom



Re: Description list with " :: " in the tag.

2023-09-11 Thread Tom Alexander
Thanks!

-- 
Tom Alexander

On Sat, Sep 9, 2023, at 5:06 AM, Ihor Radchenko wrote:
> "Tom Alexander"  writes:
>
>> Emacs version: 29.1
>> Org-mode version: 163bafb43dcc2bc94a2c7ccaa77d3d1dd488f1af
>>
>> Found a conflict between the documentation and the parser behavior. The 
>> org-mode documentation[1] for description list items says that TAG '[...] 
>> does not contain the substring " :: "'
>>
>> Using this sample document, I have created a plain list item with a tag that 
>> contains that substring by wrapping it in a verbatim block:
>> ```
>> - =foo :: bar= :: baz
>> ```
>>(item
>> ...
>> ((1 0 "- " nil nil "=foo :: bar=" 23))
>> ...
>> It seems that "TAG-TEXT" is not just text but it can include objects and 
>> those objects can include the substring " :: ".
>
> It is simpler.
> Everything after the bullet and before the last " :: " is considered as
> tag. Everything after the last " :: " is description.
> Then, tag and description are parsed, allowing objects inside.
>
> org-syntax document is inaccurate here - it says that the _first_ " :: "
> is used as tag:description delimiter, not the _last_.
>
> I do not see any benefit changing the current parser. So, we probably
> need to update org-syntax document instead.
>
> -- 
> Ihor Radchenko // yantar92,
> Org mode contributor,
> Learn more about Org mode at <https://orgmode.org/>.
> Support Org development at <https://liberapay.com/org-mode>,
> or support my work at <https://liberapay.com/yantar92>



Fixed width areas not allowing tab after leading colon.

2023-09-16 Thread Tom Alexander
The documentation for fixed width areas states: A “fixed-width line” starts 
with a colon character (:) and either a whitespace character or the immediate 
end of the line. 

Using the test document (a colon, a tab, and then the word foo):
```
:	foo
```

parses as a paragraph instead of a fixed-width area:
```
(org-data
(:standard-properties
  [1 1 1 7 7 0 nil org-data nil nil nil 3 7 nil # nil nil nil]
  :path nil :CATEGORY nil)
(section
  (:standard-properties
   [1 1 1 7 7 0 nil first-section nil nil nil 1 7 nil # nil 
nil #0])
  (paragraph
   (:standard-properties
[1 1 1 7 7 0 nil top-comment nil nil nil nil nil nil # 
nil nil #1])
   #(": foo\n" 0 6
 (:parent #2)
```
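
The mismatch can be sketched in Python as two line-level rules. Both regexps are my own approximations rather than code from Org: `DOCUMENTED` follows the wording of the syntax document, and `OBSERVED` assumes the parser accepts only a space or the end of the line after the colon.

```python
import re

# Rule as worded in the syntax document: colon, then any whitespace or end of line.
DOCUMENTED = re.compile(r"^[ \t]*:(?:\s|$)")
# Assumed parser behavior: colon, then a space or end of line only.
OBSERVED = re.compile(r"^[ \t]*:(?: |$)")

def is_fixed_width(line, rule):
    """Return True when LINE counts as a fixed-width line under RULE."""
    return rule.match(line) is not None

for line in [": foo", ":\tfoo", ":"]:
    print(repr(line), is_fixed_width(line, DOCUMENTED), is_fixed_width(line, OBSERVED))
```

Only the tab variant is classified differently by the two rules, which matches what I am seeing.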

This happens in a document in worg: 
https://git.sr.ht/~bzg/worg/tree/74e80b0f7600801b1d1594542602394c085cc2f9/item/org-contrib/org-bom.org#L499

Emacs version: GNU Emacs 29.1 (build 1, x86_64-pc-linux-musl)
Org-mode version: c703541ffcc14965e3567f928de1683a1c1e33f6 (latest in git)

Fixed-width area documentation: 
https://orgmode.org/worg/org-syntax.html#Fixed_Width_Areas
--
Tom Alexander




Re: [DISCUSSION] May we recognize everything like [[protocol:uri]] as a non-fuzzy link? (was: [BUG] URI handling is overly complicated and nonstandard [9.6.7 (N/A @ /gnu/store/mg7223g8mw90lccp6mm5g6f3

2023-09-01 Thread Tom Gillespie
This is a timely discussion. I have been thinking about how
to deal with prefixes defined by the #+link: keyword which is
directly related to this question.

I think the following might be a solution that also avoids the
issue brought up by Arne.

The original "bug" cannot be resolved because bare URIs
have syntax that conflicts with Org syntax. However I think
we can do better than directing users to org-link-set-parameters.

My suggestion is as follows. Schemes/prefixes defined by the
#+link: keyword can be used without surrounding syntax markers
but may not contain spaces etc. To support this Org parsers
should always parse prefix:suffix as a _putative_ link which
must then be checked against a list of known schemes that
are either built in or have been declared by the user to indeed
be legitimate schemes.

In the tel: case, the way to solve the original bug is simply
to add the line #+link: tel tel: which would tell Org that e.g.
tel:555-555- is a real uri, and that it should expand to
itself.

At the same time this solution would avoid Arne's issue
(which I also have in some of my documents where I use
fig: and tbl: as prefixes in names and reference them
via [[fig:figure-name]]) because the parser would only treat
prefix: in an internal link as a scheme if it is defined explicitly
by the user in a #+link: keyword or in their init.el.
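
A minimal Python sketch of the lookup I have in mind follows. Everything in it is hypothetical: the function name, the regexp for a putative prefix, the contents of the two scheme sets, and the sample phone number are illustrations only, not existing Org behavior.

```python
import re

# A putative link is anything shaped like prefix:suffix with no spaces.
PUTATIVE_LINK = re.compile(r"^([A-Za-z][A-Za-z0-9+.-]*):(\S+)$")

def classify_target(target, known_schemes):
    """Treat prefix:suffix as a link only when the prefix is a known scheme."""
    m = PUTATIVE_LINK.match(target)
    if m and m.group(1) in known_schemes:
        return ("link", m.group(1), m.group(2))
    # Unknown prefixes fall back to an internal (fuzzy) target.
    return ("internal", None, target)

builtin = {"http", "https", "file"}  # stand-in for the built-in schemes
declared = {"tel"}                   # as if the file contained "#+link: tel tel:"

print(classify_target("tel:555-0100", builtin | declared))
print(classify_target("fig:figure-name", builtin | declared))
```

With these sets the tel: target is classified as a link, while fig:figure-name stays an internal target because nobody declared fig as a scheme.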

Thoughts?
Tom



Re: [RFC] Quoting property names in tag/property matches [Was: [BUG?] Matching tags: & operator no more implicit between tags and special property]

2023-09-01 Thread Tom Gillespie
Ignore the previous message. I see that this was about matching
tags not about specifying them. Best,
Tom



Re: [RFC] Quoting property names in tag/property matches [Was: [BUG?] Matching tags: & operator no more implicit between tags and special property]

2023-09-01 Thread Tom Gillespie
Without wading too far into this, why do we need escape syntax for this?
The only character that might need an escape would be colon :, but
my reading of the syntax doc is that colon : will immediately terminate
the property, so we would update the doc to make it clear that property
names cannot contain a colon. As written, if there is an issue with the
minus sign in property names then that is a bug, but I feel like I might
be missing something?

Tom



Re: Description list with " :: " in the tag.

2023-09-13 Thread Tom Alexander
I've written a patch (attached) with my proposed wording changes to the 
documentation, should I be starting another thread or does dropping it here 
work best? I do not have commit access so I'd need someone with such authority 
to do the last bit.

-- 
Tom Alexander
From 20addaa5ab7d4e9420ade1125c2a337345ecdd31 Mon Sep 17 00:00:00 2001
From: Tom Alexander 
Date: Wed, 13 Sep 2023 18:19:05 -0400
Subject: [PATCH] org-syntax.org: Fix definition of description list tags.

Description lists support objects in their tags and they support the substring " :: ".
---
 org-syntax.org | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/org-syntax.org b/org-syntax.org
index 123fc232..3046e26c 100644
--- a/org-syntax.org
+++ b/org-syntax.org
@@ -470,9 +470,10 @@ BULLET COUNTER-SET CHECK-BOX TAG CONTENTS
 + CHECK-BOX (optional) :: A single whitespace character, an =X=
   character, or a hyphen enclosed by square brackets (i.e. =[ ]=, =[X]=, or =[-]=).
 + TAG (optional) :: An instance of the pattern =TAG-TEXT ::= where
-  =TAG-TEXT= represents a string consisting of non-newline characters
-  that does not contain the substring =" :: "= (two colons surrounded by
-  whitespace, without the quotes).
+  =TAG-TEXT= is one of more objects from the standard set so long as
+  they do not contain a newline character, until the last occurrence
+  of the substring =" :: "= (two colons surrounded by whitespace,
+  without the quotes).
 + CONTENTS (optional) :: A collection of zero or more elements, ending
   at the first instance of one of the following:
   - The next item.
-- 
2.42.0



Description list with " :: " in the tag.

2023-09-08 Thread Tom Alexander
Emacs version: 29.1
Org-mode version: 163bafb43dcc2bc94a2c7ccaa77d3d1dd488f1af

Found a conflict between the documentation and the parser behavior. The 
org-mode documentation[1] for description list items says that TAG '[...] does 
not contain the substring " :: "'

Using this sample document, I have created a plain list item with a tag that 
contains that substring by wrapping it in a verbatim block:
```
- =foo :: bar= :: baz
```

Which parses to:
```
(org-data
 (:standard-properties
  [1 1 1 23 23 0 nil org-data nil nil nil 3 23 nil # nil nil 
nil]
  :path nil :CATEGORY nil)
 (section
  (:standard-properties
   [1 1 1 23 23 0 nil first-section nil nil nil 1 23 nil # 
nil nil #0])
  (plain-list
   (:standard-properties
[1 1 1 23 23 0 nil top-comment nil nil nil nil nil nil # 
nil
   ((1 0 "- " nil nil "=foo :: bar=" 23))
   #1]
:type descriptive)
   (item
(:standard-properties
 [1 1 19 23 23 0
(:tag)
item nil nil nil nil nil nil # nil
((1 0 "- " nil nil "=foo :: bar=" 23))
#2]
 :bullet "- " :checkbox nil :counter nil :pre-blank 0 :tag
 ((verbatim
   (:standard-properties
[3 nil nil nil 15 0 nil nil nil nil nil nil nil nil # 
nil nil #3]
:value "foo :: bar"
(paragraph
 (:standard-properties
  [19 19 19 23 23 0 nil nil nil nil nil nil nil nil # nil 
nil #3])
 #("baz\n" 0 4
   (:parent #4)))
```

It seems that "TAG-TEXT" is not just text but it can include objects and those 
objects can include the substring " :: ".

[1] https://orgmode.org/worg/org-syntax.html#Items

--
Tom Alexander




Document-level properties incorrect and/or missing based on preceding blank lines and/or comments

2023-10-11 Thread Tom Alexander
Emacs version: Emacs 29.1
Org-mode version: e1569918cc94253650781e83a09695739c93352f  (latest in git)

The org-mode syntax document[1] says that property drawers can exist in the 
zeroth section with the format:
```
BEGINNING-OF-FILE
BLANK-LINES
COMMENT
PROPERTYDRAWER
```

Using this test document:
```
:PROPERTIES:
:FOO:bar
:END:
```

I correctly get the foo property in the top-level org-data
```
(org-data
(:standard-properties
  [1 1 1 33 33 0 nil org-data nil nil nil 32 33 nil # nil nil 
nil]
  :path nil :FOO "bar" :CATEGORY nil)
```

But now there are two separate issues:

### Issue 1

Putting a comment before it makes the value for the foo property incorrect 
(seems to be grabbing an earlier string slice):
```
# baz
:PROPERTIES:
:FOO:bar
:END:
```

```
(org-data
(:standard-properties
  [1 1 1 39 39 0 nil org-data nil nil nil 38 39 nil # nil nil 
nil]
  :path nil :FOO "O: " :CATEGORY nil)
```

Interestingly, looking farther down the AST, the value for foo is properly set 
in the node-property, just not the org-data:
```
(node-property
(:standard-properties
 [20 20 nil nil 33 0 nil node-property nil nil nil nil nil nil # nil nil #2]
 :key "FOO" :value "bar"))
```

### Issue 2

Putting any blank lines before it makes the foo property not appear in org-data 
at all
```

:PROPERTIES:
:FOO:bar
:END:
```

```
(org-data
(:standard-properties
  [1 1 2 34 34 0 nil org-data nil nil nil 4 34 nil # nil nil 
nil]
  :path nil :CATEGORY nil)
```

Looking farther down the AST it seems the property-drawer became a regular 
drawer
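
For reference, this is how I read the documented rule, written as a small Python sketch. It models only my reading of the syntax document (blank lines and comments may precede the drawer) and is not a port of the Org parser; the function name and the regexps are my own.

```python
import re

def zeroth_section_properties(text):
    """Collect properties from a drawer at the top of the zeroth section.

    Assumption: only blank lines and comment lines may come before the
    drawer. Returns an empty dict when no such drawer is found.
    """
    lines = text.split("\n")
    i = 0
    while i < len(lines) and (
        lines[i].strip() == "" or re.match(r"^[ \t]*#(?: |$)", lines[i])
    ):
        i += 1
    if i == len(lines) or lines[i].strip().upper() != ":PROPERTIES:":
        return {}
    props = {}
    for line in lines[i + 1:]:
        if line.strip().upper() == ":END:":
            return props
        m = re.match(r"^[ \t]*:([^\s:]+):[ \t]*(.*)$", line)
        if m is None:
            return {}  # not a node property, so not a property drawer
        props[m.group(1)] = m.group(2)
    return {}  # the drawer was never closed

drawer = ":PROPERTIES:\n:FOO:bar\n:END:\n"
for prefix in ["", "# baz\n", "\n"]:
    print(zeroth_section_properties(prefix + drawer))
```

Under this reading all three test documents should give FOO the value "bar" at the document level.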


[1] https://orgmode.org/worg/org-syntax.html#Property_Drawers

--
Tom Alexander
pgp: https://fizz.buzz/pgp.asc



Re: Inconsistent text markup handling when double-nesting markers

2023-10-11 Thread Tom Alexander
> Fixed, on main.

Thanks!

--
Tom Alexander
pgp: https://fizz.buzz/pgp.asc



Incorrect quantity of en-spaces

2023-10-16 Thread Tom Alexander
The org-mode syntax document describes entities as:
> \NAME POST
> \NAME{}
> Where NAME and POST are not separated by a whitespace character.

and POST is defined as:
> Either the end of line or a non-alphabetic character.

So using the test document:
```
\_   Foo
```
(a backslash, underscore, three spaces, and then the word Foo)

I would expect to get only 2 en-spaces but I am getting 3. Looking at 
org-entities, an underscore with 2 spaces gets 2 en-spaces, whereas an 
underscore with 3 spaces gets 3 en-spaces, but if we match all 3 spaces as NAME 
then POST becomes invalid because "F" is neither the end of the line nor a 
non-alphabetic character, so we can only match the first two spaces as NAME.
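
My reading of the rule can be sketched in Python as below. This models the documented grammar only, not org-entities itself; the function name is made up, and the cap of 20 spaces is an assumption about how many space entities are defined.

```python
def space_entity_width(text, max_spaces=20):
    """Return how many spaces belong to NAME for a leading \\_ entity.

    Tries the longest run first and accepts it only when what follows
    (POST) is the end of the text, a newline, or a non-alphabetic
    character. Returns None when no width satisfies that rule.
    """
    if not text.startswith("\\_ "):
        return None
    rest = text[2:]
    run = len(rest) - len(rest.lstrip(" "))
    for n in range(min(run, max_spaces), 0, -1):
        after = rest[n:]
        if after == "" or after[0] == "\n" or not after[0].isalpha():
            return n
    return None

print(space_entity_width("\\_   Foo"))
```

For the test document this gives 2: taking all three spaces leaves "F" as POST, which is alphabetic, so the match has to back off by one space.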

emacs version: 29.1
org-mode version: 9bbc21df84d507e568a3ebd17e105cdb9e163784 (latest in git)

--
Tom Alexander
pgp: https://fizz.buzz/pgp.asc



Re: Clock becomes a paragraph by prefixing with not-really-affiliated-keyword

2023-10-16 Thread Tom Alexander
Thanks!

--
Tom Alexander
pgp: https://fizz.buzz/pgp.asc



Re: Comments following not-really-affiliated keywords are becoming paragraphs

2023-10-16 Thread Tom Alexander
Thanks!

--
Tom Alexander
pgp: https://fizz.buzz/pgp.asc



Re: Fixed width areas not allowing tab after leading colon.

2023-09-19 Thread Tom Alexander
Thanks!

-- 
Tom Alexander

On Sun, Sep 17, 2023, at 5:48 AM, Ihor Radchenko wrote:
> "Tom Alexander"  writes:
>
>> The documentation for fixed width areas states: A “fixed-width line” starts 
>> with a colon character (:) and either a whitespace character or the 
>> immediate end of the line. 
>> ...
>> Fixed-width area documentation: 
>> https://orgmode.org/worg/org-syntax.html#Fixed_Width_Areas
>
> org-syntax.html is not accurate here. The parser only allows ": " (colon
> followed by space) and no other variant.
>
> Fixed, on master.
> https://git.sr.ht/~bzg/worg/commit/a42f57ac
>
>> This happens in a document in worg: 
>> https://git.sr.ht/~bzg/worg/tree/74e80b0f7600801b1d1594542602394c085cc2f9/item/org-contrib/org-bom.org#L499
>
> Fixed, on master.
> https://git.sr.ht/~bzg/worg/commit/0c8d5679
>
> -- 
> Ihor Radchenko // yantar92,
> Org mode contributor,
> Learn more about Org mode at <https://orgmode.org/>.
> Support Org development at <https://liberapay.com/org-mode>,
> or support my work at <https://liberapay.com/yantar92>



Re: [PATCH] Re: Description list with " :: " in the tag.

2023-09-19 Thread Tom Alexander
Sorry for the delay, I've been busy in the IRLs. I've updated the patch to 
reflect that the parser grabs the text before the last " :: " and then parses 
it as objects. The new patch is attached.
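
As a rough illustration of the wording in the new patch, the split can be sketched in Python like this. It is a simplification that only handles `- ` and `+ ` bullets and a single-line item; the function name is mine, and the real parser works differently internally.

```python
def split_item(line):
    """Split a description list item at the LAST " :: " on the line.

    Returns (tag, description); tag is None when there is no separator.
    Objects inside the tag would only be parsed after this split.
    """
    body = line[2:] if line.startswith(("- ", "+ ")) else line
    tag, sep, description = body.rpartition(" :: ")
    if not sep:
        return None, body
    return tag, description

print(split_item("- =foo :: bar= :: baz"))
print(split_item("- *foo :: bar* does not actually contain bold markup"))
```

The first call yields the tag `=foo :: bar=` with the description `baz`. The second shows why the bold example does not work: the split happens inside what looks like the bold markup before any objects are parsed.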

-- 
Tom Alexander

On Thu, Sep 14, 2023, at 7:24 AM, Ihor Radchenko wrote:
> "Tom Alexander"  writes:
>
>> I've written a patch (attached) with my proposed wording changes to
>> the documentation, should I be starting another thread or does
>> dropping it here work best?
>
> You can just modify subject with [PATCH], as I did.
>
>> ... I do not have commit access so I'd need
>> someone with such authority to do the last bit.
>
> Sure.
>
>> +  =TAG-TEXT= is one of more objects from the standard set so long as
>> +  they do not contain a newline character, until the last occurrence
>> +  of the substring =" :: "= (two colons surrounded by whitespace,
>> +  without the quotes).
>
> It does not fully represent what is going on - Org parser is top-down
> and does not parse objects before it is done parsing the descriptive
> list item. So,
>
> - *foo :: bar* does not actually contain bold markup
>
> Rather it is "* foo" tag + "bar* does not actually contain bold markup" 
> description.
>
> What happens is that the parser splits the first line of the item by the
> last " :: " and only then proceeds with parsing the tag and description
> using standard set of objects:
>
> - <> :: 
>
> -- 
> Ihor Radchenko // yantar92,
> Org mode contributor,
> Learn more about Org mode at <https://orgmode.org/>.
> Support Org development at <https://liberapay.com/org-mode>,
> or support my work at <https://liberapay.com/yantar92>

From c8812bf7d81dc824d8ecf2c03368f58884773ddf Mon Sep 17 00:00:00 2001
From: Tom Alexander 
Date: Wed, 13 Sep 2023 18:19:05 -0400
Subject: [PATCH] org-syntax.org: Fix definition of description list tags.

Description lists support objects in their tags and they support the substring " :: ".
---
 org-syntax.org | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/org-syntax.org b/org-syntax.org
index 123fc232..fc5e9a37 100644
--- a/org-syntax.org
+++ b/org-syntax.org
@@ -470,9 +470,10 @@ BULLET COUNTER-SET CHECK-BOX TAG CONTENTS
 + CHECK-BOX (optional) :: A single whitespace character, an =X=
   character, or a hyphen enclosed by square brackets (i.e. =[ ]=, =[X]=, or =[-]=).
 + TAG (optional) :: An instance of the pattern =TAG-TEXT ::= where
-  =TAG-TEXT= represents a string consisting of non-newline characters
-  that does not contain the substring =" :: "= (two colons surrounded by
-  whitespace, without the quotes).
+  =TAG-TEXT= is the text up until the last occurrence of of the
+  substring =" :: "= (two colons surrounded by whitespace, without the
+  quotes) on that line. =TAG-TEXT= is then parsed with the standard
+  set of objects.
 + CONTENTS (optional) :: A collection of zero or more elements, ending
   at the first instance of one of the following:
   - The next item.
-- 
2.42.0



Subscript with parenthesis

2023-09-21 Thread Tom Alexander
The org-mode documentation[1] states that the SCRIPT portion of the 
subscript/superscript is either an asterisk, the standard set of objects 
wrapped in balanced curly braces, or an optional sign followed by "Either the 
empty string, or a string consisting of any number of alphanumeric characters, 
commas, backslashes, and dots"

But I'm seeing the following test document parse as containing a subscript 
despite using parentheses, which I do not think match any of the above 
criteria:
```
foo_(bar)
```
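
To show what I mean, here is a small Python check of the three documented forms as I paraphrased them above. It is my own approximation of the documentation, not the parser's regexp, and for simplicity it requires at least one character in the third form.

```python
import re

# Third documented form: optional sign, then alphanumerics, commas,
# backslashes, and dots (at least one character is assumed here).
PLAIN_SCRIPT = re.compile(r"[+-]?[A-Za-z0-9,\\.]+")

def wrapped_in_balanced_braces(s):
    """True when S is one {...} group whose braces balance."""
    if not (s.startswith("{") and s.endswith("}")):
        return False
    depth = 0
    for i, ch in enumerate(s):
        depth += (ch == "{") - (ch == "}")
        if depth == 0 and i < len(s) - 1:
            return False  # the opening brace closed before the end
    return depth == 0

def is_documented_script(script):
    """Check SCRIPT against the three forms listed in the documentation."""
    return (
        script == "*"
        or wrapped_in_balanced_braces(script)
        or PLAIN_SCRIPT.fullmatch(script) is not None
    )

for script in ["*", "{bar}", "-1.5", "(bar)"]:
    print(script, is_documented_script(script))
```

Only `(bar)` is rejected by this check, yet Org parses it as a subscript.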

[1] https://orgmode.org/worg/org-syntax.html#Subscript_and_Superscript

--
Tom Alexander




Consecutive plain list items of different types

2023-09-21 Thread Tom Alexander
The org-mode documentation[1] states for plain lists that:
> List types are mutually exclusive at the same level of indentation, if both 
> types are present consecutively then they parse as separate lists.

first a minor nit-pick that "both" is probably not the correct word here since 
there are 3 types of lists, not two (unordered, ordered, and descriptive). I'd 
go with "multiple" instead IMO.

but more importantly, based on that description I would expect the following 
test document to parse into three separate plain lists, but it parses as a 
single plain list with 3 items:

```
1. foo
- bar
- lorem :: ipsum
```
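
My expectation can be written down as a short Python sketch. The classification below is a simplification I made up for this example (numeric counters only, and any item containing " :: " counts as descriptive); it models my reading of the documentation, not the parser.

```python
import re
from itertools import groupby

def item_type(line):
    """Classify a list item line as ordered, descriptive, or unordered."""
    if re.match(r"^\s*\d+[.)] ", line):
        return "ordered"
    if " :: " in line:
        return "descriptive"
    return "unordered"

def split_by_type(lines):
    """Group consecutive items of the same type into separate lists."""
    return [(kind, list(group)) for kind, group in groupby(lines, key=item_type)]

for kind, items in split_by_type(["1. foo", "- bar", "- lorem :: ipsum"]):
    print(kind, items)
```

Read this way, the test document splits into three one-item lists, whereas Org produces a single list with three items.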

[1] https://orgmode.org/worg/org-syntax.html#Plain_Lists

--
Tom Alexander
pgp: https://fizz.buzz/pgp.asc



Re: Subscript with parenthesis

2023-09-21 Thread Tom Alexander
Some additional things I'm noticing:

- when using parentheses, :use-brackets-p is nil, so they're not equivalent to 
curly braces.
- it does not support objects inside the parentheses, just plain text, which 
again means they're not equivalent to braces.
- it, however, seems to require that the parentheses are balanced because this 
test document does NOT contain a subscript:
```
foo_(b(ar)
```
which is closer to the curly braces requirement since that seems to be the only 
part of the subscript/superscript documentation that mentions needing balance.

-- 
Tom Alexander



[PATCH] Add backslash to list of POST characters for text markup

2023-09-21 Thread Tom Alexander
Backslash appears to be supported. To test I used the following test document:
```
foo ~bar~\& baz
```

This happens in a document in worg: 
https://git.sr.ht/~bzg/worg/tree/ae64e1a54185232d4ebdcab174d8d4319ffd564d/org-release-notes.org#L

The ampersand was chosen for the test document since that is not a supported 
POST character, to make sure backslash was not simply escaping the next 
character.

In the documentation I wrote out the word "backslash" in parentheses to 
disambiguate between backslash and escaping the following comma.

Patch is attached.
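
For anyone who wants to see the effect without applying the patch, here is a rough Python sketch of the check. The PRE character class and the helper name are my own simplifications; the two POST sets are taken from the documentation before and after the patch.

```python
import re

POST_BEFORE = set(" \t-.,;:!?')}[\"")  # POST characters currently documented
POST_AFTER = POST_BEFORE | {"\\"}      # the same set with backslash added

def has_code_markup(line, post_chars):
    """True when LINE contains ~code~ closed by a valid POST character."""
    # PRE is assumed to be the line start, whitespace, or one of - ( { ' "
    for m in re.finditer(r"(?:^|(?<=[\s\-({'\"]))~(\S(?:.*?\S)?)~", line):
        following = line[m.end():m.end() + 1]
        if following == "" or following in post_chars:
            return True
    return False

line = "foo ~bar~\\& baz"  # the test document: foo ~bar~\& baz
print(has_code_markup(line, POST_BEFORE))
print(has_code_markup(line, POST_AFTER))
```

With the old set the test document has no code markup under this sketch; with backslash added it does, which matches what Org actually does.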

--
Tom Alexander
pgp: https://fizz.buzz/pgp.asc

From 098434680b5e3942acc00684a47389f2cdab6208 Mon Sep 17 00:00:00 2001
From: Tom Alexander 
Date: Thu, 21 Sep 2023 21:14:33 -0400
Subject: [PATCH] Add backslash to list of POST characters for text markup.

Backslash appears to be supported. To test I used the following test document:
```
foo ~bar~\& baz
```
The ampersand was chosen since that is not a supported POST character, to make sure backslash was not simply escaping the next character.

In the documentation I wrote out the word "backslash" in parenthesis to disambiguate between backslash and escaping the following comma.
---
 org-syntax.org | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/org-syntax.org b/org-syntax.org
index c5299741..8f0f9b0c 100644
--- a/org-syntax.org
+++ b/org-syntax.org
@@ -249,9 +249,9 @@ discarded.  This also applies to single-line elements.
 
 :This paragraph will not contain
 :a long sequence of spaces before "a".
-: 
+:
 :This paragraph does not have leading spaces according to the parser.
-: 
+:
 :#+begin_src emacs-lisp
 :  (+ 1 2)
 :#+end_src
@@ -1742,7 +1742,7 @@ whitespace characters.
   verbatim) or a series of objects from the standard set. In both
   cases, CONTENTS may not begin or end with whitespace.
 + [[#Special_Tokens][POST]] :: Either a whitespace character, =-=, =.=, =,=, =;=, =:=, =!=, =?=, ='=, =)=, =}=,
-  =[=, ="=, or the end of a line.
+  =[=, ="=, =\= (backslash), or the end of a line.
 
 *Examples*
 
-- 
2.42.0



COUNTER-SET for alphabetical ordered lists ignored for utf-8 exporter

2023-09-29 Thread Tom Alexander
It seems that COUNTER-SET[1] is not being honored when exporting to utf-8 for 
alphabetical lists even though it is honored for numeric lists. When exporting 
to html, COUNTER-SET is honored for both.

Test document:
```
# An ordered list starting at 3
1. [@3] foo


# An ordered list starting at 1
m. bar


# An ordered list starting at 11
m. [@k] baz
```

Launching emacs with: (Setting org-list-allow-alphabetical is necessary or else 
the alphabetical lists will become paragraphs)
```
emacs -q --eval '(setq org-list-allow-alphabetical t)' /tmp/test.org
```

When exporting to html you get (edited to remove whitespace for clarity):
```
foo
bar
baz
```

But when exporting to utf-8 you get: (whitespace removed again)
```
3. foo
m. bar
m. baz
```

Whereas I would expect:  (whitespace removed again)
```
3. foo
m. bar
k. baz
```

On a slightly related note: it seems the COUNTER-SET[1] allows single-letter 
values even when org-list-allow-alphabetical is nil. I don't think that is 
going to hurt anyone but I figured I should mention it in case it's a bug (test 
doc: `1. [@k] foo` is a plain list starting at 11 even when 
org-list-allow-alphabetical is nil).
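
The mapping I am assuming between letters and counter values is sketched below in Python. Both helpers are hypothetical and only cover a single ASCII letter or a plain number; they are not taken from the exporter.

```python
def counter_value(counter_set):
    """Convert a COUNTER-SET such as "[@3]" or "[@k]" to a number."""
    inner = counter_set.strip()[2:-1]  # text between "[@" and "]"
    if inner.isdigit():
        return int(inner)
    if len(inner) == 1 and inner.isascii() and inner.isalpha():
        return ord(inner.lower()) - ord("a") + 1
    raise ValueError(f"unsupported counter: {counter_set!r}")

def alphabetical_bullet(n):
    """Render counter value N as a lowercase alphabetical bullet."""
    if not 1 <= n <= 26:
        raise ValueError("only a-z are representable")
    return chr(ord("a") + n - 1) + "."

print(counter_value("[@3]"), counter_value("[@k]"))
print(alphabetical_bullet(counter_value("[@k]")))
```

So `[@k]` is counter value 11, and rendering 11 back as a letter gives the `k.` bullet I would expect in the utf-8 export.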

[1] https://orgmode.org/worg/org-syntax.html#Items

Emacs 29.1, Org-mode version 9.7-pre (release_9.6.8-781-gc70354)

--
Tom Alexander
pgp: https://fizz.buzz/pgp.asc



Re: Subscript with parenthesis

2023-09-29 Thread Tom Alexander
> Not true. I tried
>
> b^(*asd*) and bold inside superscript does get parsed.

Ah thanks for double-checking! You're right, that is getting parsed. Not sure 
what test document I was using to make me think objects didn't work inside the 
parentheses.

--
Tom Alexander
pgp: https://fizz.buzz/pgp.asc



Lesser blocks allowing unescaped lines

2023-09-29 Thread Tom Alexander
This happens in worg at: 
https://git.sr.ht/~bzg/worg/tree/ba6cda890f200d428a5d68e819eef15b5306055f/exporters/ox-docstrings.org#L2490

The documentation for lesser blocks[1] states:
> Lines beginning with an asterisk or `#+` must be quoted by a comma (`,*`, 
> `,#+`).

However, the following test document parses as a lesser block despite 
containing a line starting with an unescaped #+:
```
#+CATEGORY: foo
#+begin_src text
#+CATEGORY: bar
#+end_src
```

which parses as:
```
(org-data
(:standard-properties
  [1 1 1 60 60 0 nil org-data nil nil nil 3 60 nil # nil nil 
nil]
  :path nil :CATEGORY "foo")
(section
  (:standard-properties
   [1 1 1 60 60 0 nil first-section nil nil nil 1 60 nil # 
nil nil #0])
  (keyword
   (:standard-properties
[1 1 nil nil 17 0 nil top-comment nil nil nil nil nil nil # nil nil #1]
:key "CATEGORY" :value "foo"))
  (src-block
   (:standard-properties
[17 17 nil nil 60 0 nil nil nil nil nil nil nil nil # nil 
nil #1]
:language "text" :switches nil :parameters nil :number-lines nil 
:preserve-indent nil :retain-labels t :use-labels t :label-fmt nil :value 
"#+CATEGORY: bar\n"
```

whereas I would expect this to be
```
(section
  (keyword :key "CATEGORY" :value "foo")
  (paragraph "#+begin_src text")
  (keyword :key "CATEGORY" :value "bar")
  (paragraph "#+end_src")
)
```

This test document shows that lines with an unescaped "*" do break up the 
lesser block:
```
* foo
#+begin_src text
* bar
#+end_src
```
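
For completeness, the comma quoting that the documentation asks for can be sketched in Python as follows. The regexps are my approximation of the rule (an existing run of commas gets one more comma so that quoting can be undone); the function names are made up.

```python
import re

def quote_block_line(line):
    """Prefix a comma to a line starting with "*" or "#+" (after any commas)."""
    return re.sub(r"^([ \t]*)(,*(?:\*|#\+))", r"\1,\2", line)

def unquote_block_line(line):
    """Remove one level of comma quoting added by quote_block_line."""
    return re.sub(r"^([ \t]*),(,*(?:\*|#\+))", r"\1\2", line)

for line in ["#+CATEGORY: bar", "* bar", "plain text", ",* already quoted"]:
    quoted = quote_block_line(line)
    print(repr(quoted), unquote_block_line(quoted) == line)
```

Under the documented rule the block body in my first test document would have to be written as `,#+CATEGORY: bar`.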


[1] https://orgmode.org/worg/org-syntax.html#Blocks

--
Tom Alexander
pgp: https://fizz.buzz/pgp.asc



Re: Extra paragraphs incorrectly spawning when ":end:" appears.

2023-09-30 Thread Tom Alexander
Same problem occurs with this sample document:
```
foo
#+BEGIN: bar
baz
```

which parses as:
```
(section
  (paragraph "foo\n")
  (paragraph "#+BEGIN: bar\nbaz\n")
)
```

again, no blank lines and no non-paragraph elements but the single paragraph 
got split in two.

--
Tom Alexander
pgp: https://fizz.buzz/pgp.asc



Extra paragraphs incorrectly spawning when ":end:" appears.

2023-09-30 Thread Tom Alexander
This test document has 1 paragraph:
```
foo
bar
baz
```
which parses as:
```
(section
  (paragraph "foo\nbar\nbaz\n")
)
```

This test document should have 1 paragraph but org-mode is parsing it as 2:
```
foo
:end:
baz
```

which parses as:
```
(section
  (paragraph "foo\n")
  (paragraph ":end:\nbaz\n")
)
```

The paragraph documentation[1] states that:
> Empty lines and other elements end paragraphs.

But the document contains no empty lines and we can see in the output that it 
only contains paragraphs.

[1] https://orgmode.org/worg/org-syntax.html#Paragraphs

--
Tom Alexander
pgp: https://fizz.buzz/pgp.asc


[PATCH] Clarify that REST is not supported on the start TIME in a time-range timestamp.

2023-10-02 Thread Tom Alexander
If REST is included in the first TIME on a time-range timestamp then the entire 
timestamp becomes a single range-less timestamp. To test I used the following 
test document:
```
[1970-01-01 Thu 8:15-13:15foo]
[1970-01-01 Thu 8:15foo-13:15]
```

The first line parses as a timerange from 8:15-13:15.
The second line parses as a single timestamp at 8:15.
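
The behavior can be reproduced with a simplified Python regexp. This is my own sketch for inactive timestamps with a time, not the pattern used by Org; the group names are arbitrary.

```python
import re

TIMESTAMP = re.compile(
    r"^\[(?P<date>\d{4}-\d{2}-\d{2}) (?P<day>[^\s\d+\-\]>]+)"
    r" (?P<start>\d{1,2}:\d{2})"    # start TIME, with no REST allowed here
    r"(?:-(?P<end>\d{1,2}:\d{2}))?"  # optional end TIME directly after "-"
    r"(?P<rest>[^\]\n]*)\]$"         # REST only after the last TIME
)

def parse_times(timestamp):
    """Return (start, end, rest) for TIMESTAMP, or None if it does not match."""
    m = TIMESTAMP.match(timestamp)
    return None if m is None else (m["start"], m["end"], m["rest"])

print(parse_times("[1970-01-01 Thu 8:15-13:15foo]"))
print(parse_times("[1970-01-01 Thu 8:15foo-13:15]"))
```

In this sketch the first line yields a start and an end time with "foo" as REST, while in the second line everything after 8:15 is swallowed by REST and there is no end time.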

--
Tom Alexander
pgp: https://fizz.buzz/pgp.asc
From b1114e983d961d48e1d837b8d2ad209a976a5417 Mon Sep 17 00:00:00 2001
From: Tom Alexander 
Date: Mon, 2 Oct 2023 17:35:28 -0400
Subject: [PATCH] * org-syntax.org (Timestamps): Clarify that REST is not
 supported on the start TIME in a time-range timestamp.

If REST is included in the first TIME on a time-range timestamp then the entire timestamp becomes a single range-less timestamp. To test I used the following test document:
```
[1970-01-01 Thu 8:15-13:15foo]
[1970-01-01 Thu 8:15foo-13:15]
```

The first line parses as a timerange from 8:15-13:15.
The second line parses as a single timestamp at 8:15.
---
 org-syntax.org | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/org-syntax.org b/org-syntax.org
index c2061431..0c326ba8 100644
--- a/org-syntax.org
+++ b/org-syntax.org
@@ -1686,9 +1686,10 @@ -MM-DD DAYNAME
   - DAYNAME (optional) :: A string consisting of non-whitespace
 characters except =+=, =-=, =]=, =>=, a digit, or =\n=.
 + TIME (optional) :: An instance of the pattern =H:MMREST= where =H=
-  represents a one to two digit number (and can start with =0=), and =M=
-  represents a single digit.  =REST= can contain anything but =\n= or
-  closing bracket.
+  represents a one to two digit number (and can start with =0=), and
+  =M= represents a single digit.  =REST= can contain anything but =\n=
+  or closing bracket. =REST= cannot exist on the start TIME in a
+  time-range timestamp (the patterns with =TIME-TIME=).
 + REPEATER-OR-DELAY (optional) :: An instance of the following pattern:
   #+begin_example
 MARK VALUE UNIT
-- 
2.42.0



Re: [PATCH] Clarify that REST is not supported on the start TIME in a time-range timestamp.

2023-10-02 Thread Tom Alexander
Potentially related, org-mode is accepting this malformed timestamp from[1]:
```
<2016-02-14 Sun ++y>
```

The org-mode documentation[2] only includes REST with TIME, defining TIME as 
"H:MMREST". The above does not have any TIME but it accepts the timestamp 
anyway:
```
(timestamp
  :type active
  :range-type nil
  :raw-value "<2016-02-14 Sun ++y>"
  :year-start 2016
  :month-start 2
  :day-start 14
  :hour-start nil
  :minute-start nil
  :year-end 2016
  :month-end 2
  :day-end 14
  :hour-end nil
  :minute-end nil
)
```

Perhaps that grammar is wrong and REST needs to be separated from TIME?

[1] 
https://github.com/howardabrams/pdx-emacs-hackers/blob/bfb7bd640fdf0ce3def21f9fc591ed35d776b26d/workshops/org-mode-gtd-feature-demo.org#L183
[2] https://orgmode.org/worg/org-syntax.html#Timestamps

--
Tom Alexander
pgp: https://fizz.buzz/pgp.asc



Re: Extra paragraphs incorrectly spawning when ":end:" appears.

2023-10-02 Thread Tom Alexander
Hmm thanks, that makes sense. I guess a post-processing step to merge adjacent 
paragraphs wouldn't work either since that wouldn't stitch together objects 
like the bold in this test document without re-parsing the entire paragraph:
```
foo *bar
:end:
baz*
```

oh well

--
Tom Alexander
pgp: https://fizz.buzz/pgp.asc



Re: [PATCH] Add backslash to list of POST characters for text markup

2023-09-29 Thread Tom Alexander
Thanks!

--
Tom Alexander
pgp: https://fizz.buzz/pgp.asc

On Fri, Sep 22, 2023, at 5:29 AM, Ihor Radchenko wrote:
> "Tom Alexander" writes:
>
>> Backslash appears to be supported. To test I used the following test 
>> document:
>> ```
>> foo ~bar~\& baz
>> ```
>
> Thanks!
> You are right.
> Applied, onto master, with minor amendments to the commit message.
> https://git.sr.ht/~bzg/worg/commit/ba6cda89
>
> -- 
> Ihor Radchenko // yantar92,
> Org mode contributor,
> Learn more about Org mode at <https://orgmode.org/>.
> Support Org development at <https://liberapay.com/org-mode>,
> or support my work at <https://liberapay.com/yantar92>



Comments following not-really-affiliated keywords are becoming paragraphs

2023-10-11 Thread Tom Alexander
Emacs version: 29.1
Org-mode version: e1569918cc94253650781e83a09695739c93352f (latest in git)

Test document:
```
#+CAPTION: foo
# bar
```

This parses as a paragraph with the caption of foo and the body of "# bar" when 
it should parse as a regular keyword followed by a comment.

Relevant org-syntax[1] bit:

> a keyword with the same KEY as an affiliated keyword may occur so long as it 
> is not immediately preceding a valid element that can be affiliated. For 
> example, an instance of #+caption: hi followed by a blank line will be parsed 
> as a keyword, not an affiliated keyword.


[1] https://orgmode.org/worg/org-syntax.html#Keywords

--
Tom Alexander
pgp: https://fizz.buzz/pgp.asc



Org-mode starting with 37d6bde27 errors out parsing org-mode/testing/examples/pub/a.org

2023-10-11 Thread Tom Alexander
Steps to reproduce:
 1. Build emacs 29.1
 2. Build org-mode with revision 37d6bde27fe228cdadcb5cdaa09287872a508777
 3. Run the following:
```
emacs -q --no-site-file --no-splash --batch --eval "(progn
 (require 'org)
 (setq vc-handled-backends nil)
 (find-file-read-only \"org-mode/testing/examples/pub/a.org\")
 (org-mode)
 (message \"%s\" (pp-to-string (org-element-parse-buffer)))
)"
```

I've attached a Dockerfile that reproduces the issue. Just throw that in a 
directory and run `docker build -t temp .` to see it fail. Change the `ARG 
ORG_VERSION=` line to `ac108a3ac1b332bf27ff2984a9cf26af3744185d` to see it 
succeed.

Error message:
```
File mode specification error: (void-function org-export--list-bound-variables)

Error: void-function (org-export--list-bound-variables)
  mapbacktrace(#f(compiled-function (evald func args flags) #))
  debug-early-backtrace()
  debug-early(error (void-function org-export--list-bound-variables))
  org-export--list-bound-variables()
  org-element--generate-copy-script(# :copy-unreadable 
do-not-check :drop-visibility t :drop-narrowing t :drop-contents t :drop-locals 
nil)
  org-element-copy-buffer(:to-buffer # :drop-visibility t 
:drop-narrowing t :drop-contents t :drop-locals nil)
  org-element-parse-secondary-string("<2014-03-04 Tue>" (bold citation code 
entity export-snippet inline-babel-call inline-src-block italic line-break 
latex-fragment link macro radio-target statistics-cookie strike-through 
subscript superscript target timestamp underline verbatim))
  org-macro--find-date()
  org-macro--collect-macros()
  org-macro-initialize-templates()
  org-mode()
  (progn (require 'org) (setq vc-handled-backends nil) (find-file-read-only 
"/input/home/talexander/git/org-mode/testing/examples/pub/a.org") (org-mode) 
(message "%s" (pp-to-string (org-element-parse-buffer))))
  command-line-1(("--no-splash" "--eval" "(progn\n (require 'org)\n 
(setq vc-handled-backends nil)\n (find-file-read-only 
\"/input/home/talexander/git/org-mode/testing/examples/pub/a.org\")\n 
(org-mode)\n (message \"%s\" (pp-to-string 
(org-element-parse-buffer)))\n)"))
  command-line()
  normal-top-level()
Symbol’s function definition is void: org-export--list-bound-variables
```

--
Tom Alexander
pgp: https://fizz.buzz/pgp.asc


Dockerfile
Description: Binary data


Keyword becoming a paragraph based on optval

2023-10-12 Thread Tom Alexander
Emacs version: 29.1
Org-mode version: f3de4c3e041e0ea825b5b512dc0db37c78b7909e (latest in git)

This test document parses as a keyword:
```
#+CAPTION[*foo*]: baz
```

but this test document parses as a paragraph:
```
#+CAPTION[*foo* bar]: baz
```

--
Tom Alexander
pgp: https://fizz.buzz/pgp.asc



Clock becomes a paragraph by prefixing with not-really-affiliated-keyword

2023-10-12 Thread Tom Alexander
This test document correctly parses as a clock:
```
CLOCK: [2023-04-21 Fri 19:43]
```

This test document incorrectly parses as a paragraph:
```
#+NAME: foo
CLOCK: [2023-04-21 Fri 19:43]
```

--
Tom Alexander
pgp: https://fizz.buzz/pgp.asc



Re: Keyword becoming a paragraph based on optval

2023-10-12 Thread Tom Alexander
> Note that _affiliated keyword_ has an optional form of

Ah, that was what I was missing, thanks!

--
Tom Alexander
pgp: https://fizz.buzz/pgp.asc



Re: Clarify that REST is not supported on the start TIME in a time-range timestamp.

2023-10-06 Thread Tom Alexander
> As for the problem with REST you raised, I am inclined to remove it from
> syntax doc for the time being - it only creates more confusion,
> unfortunately.

Makes sense, thanks. Is there anything we do to mark patches as rejected? I 
removed [PATCH] from the subject line.

--
Tom Alexander
pgp: https://fizz.buzz/pgp.asc



Re: Lesser blocks allowing unescaped lines

2023-10-06 Thread Tom Alexander
Thank you! Makes sense.

--
Tom Alexander
pgp: https://fizz.buzz/pgp.asc



Re: Incorrect quantity of en-spaces

2023-10-18 Thread Tom Alexander
> This appears to be a special case, not documented on org-syntax page.

Sounds good, thanks! 

--
Tom Alexander
pgp: https://fizz.buzz/pgp.asc



Inconsistent text markup handling when double-nesting markers

2023-10-09 Thread Tom Alexander
I used the following test document:
```
__foo__

**foo**
```

I'd expect the two to behave the same but the first one parses as:
```
(paragraph
  "_"
  (subscript "foo")
  "__"
  )
```

Whereas the second parses as:
```
(paragraph
  (bold
    (bold
      "foo"
      )
    )
  )
```

This pattern happens in worg at [2]

Looking at the description for text markup in the syntax document[1], I don't 
see any reason the first wouldn't be parsed as an underline:

1. PRE: valid because it is the beginning of a line
2. MARKER: valid underscore
3. CONTENTS: valid. Series of objects from standard set includes both subscript 
and text markup, so regardless of how we parse the interior, it's valid. Also 
cannot begin or end with whitespace but there is no whitespace in the CONTENTS.
4. MARKER: valid underscore
5. POST: Only valid if we extend the underline to the 2nd underscore so it ends 
at the end of the line. But the 2nd line shows us that having copies of the 
marker inside the CONTENTS is fine so I see two possible expected parses of the 
CONTENTS:
4a. (underline "foo")
4b. ((subscript "foo") (plain-text "_"))

I also ran the following test document to further prove that having copies of 
the marker inside the CONTENTS is fine:
```
*foo*bar*
```
which parses as (bold "foo*bar")

So the only way the top line would fail to parse as an underline is if it 
matched the first closing underscore as closing the underline, but that would 
be invalid because underscore is not a valid POST character and invalid copies 
of the closing marker are ignored as proven by both "**foo**" and "*foo*bar*".


[1] https://orgmode.org/worg/org-syntax.html#Emphasis_Markers
[2] 
https://git.sr.ht/~bzg/worg/tree/ba6cda890f200d428a5d68e819eef15b5306055f/org-contrib/babel/intro.org#L117

--
Tom Alexander
pgp: https://fizz.buzz/pgp.asc



Re: COUNTER-SET for alphabetical ordered lists ignored for utf-8 exporter

2023-10-11 Thread Tom Alexander
Thanks!

> We aim to reduce config-dependent Org syntax in the long term.

That's wonderful news! Sometimes this stuff can really surprise you. For 
example, the structure of the document created by running `echo "1. foo\n 
1.bar\n1.baz\n\t1.lorem"` changes based on the user's **tab-width**!!

If tab-width is less than 8 then this is:
```text
1. foo
  1. bar
1. baz
  2. lorem
```

If tab-width is 8 then this is:
```text
1. foo
  1. bar
1. baz
2. lorem
```

and if tab-width is greater than 8 this is:
```text
1. foo
  1. bar
1. baz
  1. lorem
```

Absolute madness! I always considered tab-width to be a personal aesthetic 
choice and not something that would functionally change how documents other 
people wrote will be parsed.
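
To make the tab-width dependence concrete, here is a minimal Python sketch of how the indentation column of each line changes with the tab width. It is not Org's list parser; it only assumes that tab stops fall at every multiple of the tab width, which is what `str.expandtabs` implements. The helper name `indentation` and the sample lines are illustrative.

```python
def indentation(line: str, tab_width: int) -> int:
    """Return the column of the first non-whitespace character, expanding tabs."""
    expanded = line.expandtabs(tab_width)
    return len(expanded) - len(expanded.lstrip())


# Sample lines modeled on the echo command above (illustrative, not a copy).
lines = ["1. foo", " 1. bar", "1. baz", "\t1. lorem"]

for tab_width in (4, 8, 16):
    # Only the tab-indented line moves; the space-indented one stays at column 1.
    print(tab_width, [indentation(line, tab_width) for line in lines])
```

Since list nesting is decided by comparing these columns, the last item's depth relative to " 1. bar" depends on a setting that lives outside the document.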

Idk if it's been discussed, but personally if I were given dictatorship over 
org-mode I would take all of these emacs variables that are defined outside of 
the document, and instead of having them influence org-mode directly, I would 
*only* use them to pre-populate values for in-buffer settings templates.

For example, if a user had set `org-odd-levels-only` then I wouldn't have that 
impact ANY existing document they open, but if they open a new document then I 
would have it auto-insert `#+STARTUP: odd` at the top of the fresh document.

Otherwise it seems like org-mode is unsuitable for multi-person collaboration 
without dictating the contents of everyone's `.emacs` file.

--
Tom Alexander
pgp: https://fizz.buzz/pgp.asc



Re: Clarification on blank lines following list items

2023-08-20 Thread Tom Alexander
Thank you so much for explaining all of that! There is some good information 
there I was missing. I think the most important bit I was missing is the 
post-blank stuff. I was only looking at begin->end but I think digging into the 
post-blank is what makes this consistent.

I've got 2 separate questions:

1. Is the following statement true? "Two elements can count the same character 
in their post-blank?"
I am seeing dual-ownership of the post-blank in the examples below, but at the 
same time if I put a plain-list inside a footnote definition, the footnote 
definition ends up with sole custody of the post-blank.

2. I'm still not sure about some behavior I'm seeing. I think it would be 
easiest to see if we focus on exactly 1 blank line:

```
1. bar
2. baz
   < this blank line here
ipsum
```

In this example, the blank line gets counted in the post-blank for the 
plain-list but not for the item:
```
plain-list: post-blank 1 | begin 1 end 16 | contents-begin 1 contents-end 15
  item: post-blank 0 | begin 1 end 8 | contents-begin 4 contents-end 8
    paragraph: post-blank 0 | begin 4 end 8 | contents-begin 4 contents-end 8
  item: post-blank 0 | begin 8 end 15 | contents-begin 11 contents-end 15
    paragraph: post-blank 0 | begin 11 end 15 | contents-begin 11 contents-end 15
paragraph: post-blank 0 | begin 16 end 22 | contents-begin 16 contents-end 22
```

but if we take that plain-list and nest it inside another plain-list:
```
1. foo
   1. bar
   2. baz
   < this blank line here
2. lorem
ipsum
```

The blank line gets counted as a post-blank for both the item "foo" and the 
item "baz":
```
plain-list: post-blank 0 | begin 1 end 38 | contents-begin 1 contents-end 38
  item: post-blank 1 | begin 1 end 29 | contents-begin 4 contents-end 28
    paragraph: post-blank 0 | begin 4 end 8 | contents-begin 4 contents-end 8
    plain-list: post-blank 0 | begin 8 end 29 | contents-begin 8 contents-end 29
      item: post-blank 0 | begin 8 end 18 | contents-begin 14 contents-end 18
        paragraph: post-blank 0 | begin 14 end 18 | contents-begin 14 contents-end 18
      item: post-blank 1 | begin 18 end 29 | contents-begin 24 contents-end 28
        paragraph: post-blank 0 | begin 24 end 28 | contents-begin 24 contents-end 28
  item: post-blank 0 | begin 29 end 38 | contents-begin 32 contents-end 38
    paragraph: post-blank 0 | begin 32 end 38 | contents-begin 32 contents-end 38
paragraph: post-blank 0 | begin 38 end 44 | contents-begin 38 contents-end 44
```
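
One way to read the nested dump: for every node quoted, post-blank equals end minus contents-end. The Python sketch below checks that against the numbers copied from the dump. The tuple layout and the helper name `trailing_blank` are illustrative, and the relation is only verified for this dump; it coincides here because the single blank line is one newline character, and it is not claimed for elements that have closing delimiters.

```python
# (type, post_blank, begin, end, contents_begin, contents_end), copied from the dump.
nodes = [
    ("plain-list", 0, 1, 38, 1, 38),
    ("item", 1, 1, 29, 4, 28),
    ("paragraph", 0, 4, 8, 4, 8),
    ("plain-list", 0, 8, 29, 8, 29),
    ("item", 0, 8, 18, 14, 18),
    ("paragraph", 0, 14, 18, 14, 18),
    ("item", 1, 18, 29, 24, 28),
    ("paragraph", 0, 24, 28, 24, 28),
    ("item", 0, 29, 38, 32, 38),
    ("paragraph", 0, 32, 38, 32, 38),
    ("paragraph", 0, 38, 44, 38, 44),
]


def trailing_blank(node):
    """Characters between the end of the contents and the end of the node."""
    _type, _post_blank, _begin, end, _contents_begin, contents_end = node
    return end - contents_end


# Collect every node whose reported post-blank differs from end - contents-end.
mismatches = [node for node in nodes if trailing_blank(node) != node[1]]
print(mismatches)
```

Read this way, the inner plain-list reports post-blank 0 because its contents-end (29) already extends past the blank line, so the blank sits inside its contents, attributed to the item "baz", rather than after them.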

Meaning the post-blank did this movement:
```
plain-list: post-blank 0
  item: post-blank 1              <---<<<-\
    paragraph: post-blank 0               |
    plain-list: post-blank 0      >>------|
      item: post-blank 0                  |
        paragraph: post-blank 0           |
      item: post-blank 1          <---<---/
        paragraph: post-blank 0
  item: post-blank 0
    paragraph: post-blank 0
paragraph: post-blank 0
```


Question ---> So why is the item "baz" gaining a post-blank instead of the 
inner plain-list (bar baz) keeping that post-blank?

I would expect it to instead be:
```
            plain-list: post-blank 0
              item: post-blank 1
                paragraph: post-blank 0
here ->         plain-list: post-blank 1
                  item: post-blank 0
                    paragraph: post-blank 0
not here ->       item: post-blank 0
                    paragraph: post-blank 0
              item: post-blank 0
                paragraph: post-blank 0
            paragraph: post-blank 0
```

I re-did both test cases using greater blocks and lesser blocks instead of 
paragraphs to make sure it wasn't that historical exception at the end of your 
email, and the post-blank behavior was exactly the same.


-- 
Tom Alexander



Re: [BUG] inline src blocks in caption of not-inline src blocks do not execute

2023-08-16 Thread Tom Gillespie
Confirming fixed. Thanks!

PS A new issue arises however caused by
487f39efa68fa2d857f8d446d1c4b3a3b3e3f482,
which is that it is now confusing to get the {{{results(=value=)}}}
macro without verbatim which is what :results drawer meant in
that context. I expect that change will break things for a number
of people beyond myself. A bit of reading the code revealed that
setting :wrap t makes it possible to get {{{results(value)}}} again,
but line 2647 [1] seems to indicate that drawer is an expected
and valid value for inline :results.


1.
   ((or (member "drawer" result-params)
        ;; Stay backward compatible with <7.9.2
        (member "wrap" result-params))
    (goto-char beg) (when (org-at-table-p) (org-cycle))
    (funcall wrap ":results:" ":end:" 'no-escape nil
             "{{{results(" ")}}}"))



Re: [PATCH] ob-tangle.el: restore :tangle closure nil behavior

2023-08-16 Thread Tom Gillespie
> My confusion about you patch comes from the fact that
>
> #+begin_src emacs-lisp :tangle (if (= 1 1) "yes")
> 2
> #+end_src
>
> works just fine on main.

It appears to work fine on main, but that is because
what is actually happening behind the scenes is that in the test
(unless (or (string= src-tfile "no") ...) ...) the actual comparison is
(string= "(if (= 1 1) \"yes\")" "no") which appears to work, but is
not comparing the result of the closure, only its string value.
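
The distinction can be sketched in Python as an analogy. This is not ob-tangle code: `evaluate` is a hypothetical stand-in for evaluating the header argument, backed by a lookup table of the expressions discussed in this thread, and both function names are made up for illustration.

```python
def tangles_by_raw_string(raw_value: str) -> bool:
    # Mirrors (string= src-tfile "no") applied to the unevaluated text.
    return raw_value != "no"


def tangles_by_result(raw_value: str, evaluate) -> bool:
    # Evaluate the header argument first, then compare the result.
    result = evaluate(raw_value)
    return result is not None and result != "no"


def evaluate(raw_value: str):
    # Hypothetical evaluator: a lookup table instead of a real Lisp reader.
    table = {'(or "no")': "no", '(if (= 1 1) "yes")': "yes"}
    return table.get(raw_value, raw_value)


# The raw-string test treats the closure text as a file name to tangle to.
print(tangles_by_raw_string('(or "no")'))
# Comparing the evaluated result gives the intended "do not tangle".
print(tangles_by_result('(or "no")', evaluate))
```

For the `(if (= 1 1) "yes")` block both checks agree, which is why it appears to work on main.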

> I admit that I don't fully understand your use case.

I want to use a closure to conditionally control whether a block will tangle.
If I hardcode :tangle no, then :var x=(error "oops") will not evaluate. The
(error "oops") is a placeholder for a number of things that will result in
an error if the condition for :tangle (when condition "file-name") is not
satisfied.

> Something like (org-babel-get-heading-arg :tangle info/params)

I need to go to bed, because I definitely started on an implementation
of that and then forgot about it as a potential solution. Yes, this seems
like a better and clearer way to go about it.

> May you please elaborate?

Disregard, your suggestion clarified what you meant, and in
that case, yes we can consolidate.



Re: [BUG] inline src blocks in caption of not-inline src blocks do not execute

2023-08-16 Thread Tom Gillespie
> It was a slip when the patch was applied.
> See the table of :results params vs. expected output that Nicolas
> provided in
> https://list.orgmode.org/orgmode/87zjbqrapy@nicolasgoaziou.fr/
>
> Or maybe I miss something.
>
> May you please explain more about {{{results(=value=)}}} problem?
> Isn't it sufficient to do src_elisp[:results verbatim]{'value} 
> {{{results(=value=)}}}?

The issue is the opposite I think. Currently the default value (i.e. absent)
for :results does not produce {{{results(value)}}} as suggested, and instead
produces {{{results(=value=)}}}. This means that without :results drawer
there isn't an obvious way to get {{{results(value)}}} because you can't e.g.
use [:results default], or if a user overrides the default value for inline
header args at file level then they have no way to reset to the default.

It looks like there used to be an option [:results wrap] which was deprecated
a _very_ long time ago. [:results drawer] replaced that, and while there is
some confusion about the name (because there is no actual drawer in an
inline result) the behavior was meant to replace the old :results wrap behavior
where the name does make sense since {{{results(value)}}} do "wrap" the value.

I think that covers it, but let me know if something doesn't make sense. Best,
Tom



Re: [PATCH] ob-tangle.el: restore :tangle closure nil behavior

2023-08-16 Thread Tom Gillespie
On Wed, Aug 16, 2023 at 2:09 AM Ihor Radchenko wrote:
>
> Tom Gillespie writes:
>
> > Subject: [PATCH] ob-tangle.el: restore :tangle closure evaluation before 
> > eval
> >  info
> > This patch fixes a bug where header arguments like :tangle (or "no")
> > were treated as if they were tangling to a file named "(or \"no\")".
> > As a result, org-babel would call org-babel-get-src-block-info with
> > 'no-eval set to nil, causing parameters to be evaluated despite the
> > fact that when :tangle no or equivalent is set, the other parameters
> > should never be evaluated.
>
> What do you mean by "restore"? Were it evaluated in the past?
> May you please provide a reproducer?

Hrm. I think I may have mixed two commit lines. It is the case that
:tangle closures used to work, but you are right, the historical behavior
when tangling closures meant that all parameters were evaluated (tested
with the block below in 27 and 28).

#+begin_src elisp :var value=(error "oops") :tangle (or "no")
value
#+end_src

My use case is that I have blocks that I want to tangle that set :var
from e.g. the library of babel, which isn't always loaded, but which also
is not required if :tangle is no.

> > -(defun org-babel-tangle--unbracketed-link (params)
> > +(defun org-babel-tangle--unbracketed-link (params  
> > info-was-evaled)
>
> This is not acceptable. Taking care about evaluating INFO should be done
> in a single place instead of adding checks across the babel code. If we
> go the proposed way, I expect a number of bugs appearing when somebody
> forgets to change the eval check in some place.

I don't like the solution either. I see two potential alternatives.
1. change the structure of the info list to indicate whether it has
already been evaluated
2. always call org-babel-read on (cdr (assq :tangle params)) even
if it may already have been evaluated which can lead to some unexpected
and potentially nasty results.

I don't think we can consolidate evaluating parameters
into one place in the general case because there are
order dependencies where a setting in one param header
should mask others (as is the case here). In principle we
could consolidate them, but I think that would add significant
complexity because we would have to push all the logic for
handling whether a given ordering restriction applies inside
that location. e.g. if I have a block set :eval (if ev "yes" "no")
it would be bad form to evaluate the parameters before determining
whether the :eval closure evaluates to "yes" or "no". Should that
go inside org-process-params, or should it be handled locally
by e.g. org-babel-tangle and org-babel-execute-src-block separately?

Thoughts?


