On Tuesday, 27 May 2025 18:44:19 UTC Martin Edström wrote:
> Well-explained! Thank you, Kristoffer :)
>
> On Mon, 26 May 2025 16:02:30 -0500, Kristoffer Balintona <[email protected]> wrote:
> > On Mon, May 26, 2025 at 12:02 PM chris <[email protected]> wrote:
> > > Org-node seems very interesting! I noticed that your
> > > [parser.el](https://github.com/meedstrom/org-mem/blob/main/org-mem-parser.el)
> > > is only about 600 lines long, whereas Org-mode’s parser seems larger and
> > > possibly more scattered? Are they roughly equivalent in scope/intent, or
> > > is your version focused on a different subset of Org features?
> >
> > Hi,
> >
> > I am not Martin, but I’ll share a bit about what I’ve gathered about the
> > project after having used org-node for a few months.
> >
> > As far as I can tell, the org-mem parser is a parser specially tailored
> > for a specific end, namely, speed. What sets org-node apart from
> > org-roam is that it does not need anything on-disk; it maintains hash
> > tables inside Emacs for all its data. (Additionally, and in line with
> > org-node’s mission for performance, it does not end up needing to load
> > org at all, since its parser is an implementation independent of it.) It
> > can get away with this because the parser is very fast and leverages
> > el-job’s[1] asynchronous processing of lists.
> >
> > Of course, the trade off for parsing speed is completeness: org-mem must
> > implement its own regexps to find the data it needs. Everything else is
> > ignored. So if org-mem wants to collect e.g. timestamp data, it must do
> > so without any help from org (as was recently implemented). Org also
> > does a lot to process things like org keywords in files. And, of course,
> > this approach is susceptible to mismatches with what org’s parser
> > actually recognizes since org-mem’s parser is bespoke.
> >
> > I’m guessing part of Martin’s motivation to ask his original question is
> > related to how tenable maintaining a parser independent from org is. It
> > would be much easier to rely on the definitive org parser if possible. And
> > if I would speculate further, I think what he has in mind is to store
> > the parse trees on disk and read from those (potentially caching those
> > on-disk parse trees if necessary) rather than the user’s files. This way,
> > performance is still fast since the user’s org files are already parsed
> > (which is the expensive part).
> >
> > Martin can chime in to correct me if I’m wrong.

Is the idea to memoize the output of `org-element-parse-buffer` in a file, using
a modification time or checksum to verify that the content hasn't changed, so
that the cached result can be reused later without parsing the Org file again,
as in the minimal, naive example below?

#+begin_src emacs-lisp :lexical t :wrap example :results raw
;; I use org-mode "83a55c6fe"; the example might not work
;; with earlier versions.
(let* ((org-content
        (mapconcat #'identity
                   '("# I'm a comment" nil
                     ":PROPERTIES:"
                     ":ID: 463e4d2b-65d7-40ea-ad2d-80abd9edbeff"
                     ":special_property: cool special property"
                     ":END:"
                     "#+title: cool title" nil
                     "* hello"
                     "hello" nil)
                   "\n"))
       ;; Parse the content in a temporary Org buffer.
       (ast (with-temp-buffer
              (insert org-content)
              (org-mode)
              (org-element-parse-buffer)))
       ;; Print shared and circular structure (e.g. parent links)
       ;; in a form that `read' can reconstruct.
       (print-circle t)
       (as-a-string (prin1-to-string ast))
       ;; The temp buffer is dead by now and prints as the unreadable
       ;; "#<killed buffer>", so swap in a readable placeholder.
       (cleaned-string
        (replace-regexp-in-string
         "#<killed buffer>" "\"quux\"" as-a-string))
       ;; Here, you can save the string to a file.
       ;; Then, you can reuse the string to convert it back into an AST
       ;; without having to parse the org-file again.
       (ast-out (car (read-from-string cleaned-string)))
       (new-content (org-element-interpret-data ast-out)))
  (princ new-content))
#+end_src
#+RESULTS:
#+begin_example
# I'm a comment
:PROPERTIES:
:ID: 463e4d2b-65d7-40ea-ad2d-80abd9edbeff
:special_property: cool special property
:END:
,#+title: cool title
,* hello
hello
#+end_example
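
To make the checksum part of the question concrete, here is a rough sketch
of how the same round trip could be keyed on a content hash. The names
`my/org-ast-cache-dir`, `my/org-ast-parse-string` and `my/org-ast-for-file`
are made up for illustration, the cache location is an arbitrary choice, and
the "#<killed buffer>" workaround is the same assumption as in the example
above; none of this is taken from org-mem or org-node.

#+begin_src emacs-lisp :lexical t
(require 'org)
(require 'org-element)

(defvar my/org-ast-cache-dir
  (expand-file-name "org-ast-cache/" temporary-file-directory)
  "Hypothetical directory holding one printed AST per content hash.")

(defun my/org-ast-parse-string (content)
  "Parse CONTENT as Org and return the AST printed as a readable string."
  (let ((ast (with-temp-buffer
               (insert content)
               (org-mode)
               (org-element-parse-buffer)))
        (print-circle t))
    ;; Same placeholder trick as in the example above.
    (replace-regexp-in-string
     "#<killed buffer>" "\"quux\"" (prin1-to-string ast))))

(defun my/org-ast-for-file (file)
  "Return the AST for FILE, parsing only if its content hash is new."
  (let* ((coding-system-for-read 'utf-8)
         (coding-system-for-write 'utf-8)
         (content (with-temp-buffer
                    (insert-file-contents file)
                    (buffer-string)))
         ;; The SHA-256 of the content names the cache entry, so any
         ;; edit to FILE leads to a different (missing) cache file.
         (hash (secure-hash 'sha256 content))
         (cache-file (expand-file-name hash my/org-ast-cache-dir)))
    (unless (file-exists-p cache-file)
      (make-directory my/org-ast-cache-dir t)
      (with-temp-file cache-file
        (insert (my/org-ast-parse-string content))))
    ;; Always read the AST back from the cache file.
    (with-temp-buffer
      (insert-file-contents cache-file)
      (car (read-from-string (buffer-string))))))
#+end_src

Calling `(my/org-ast-for-file "notes.org")` twice should parse on the first
call only; the second call still has to read and hash the file, but skips
`org-element-parse-buffer`. Keying on the modification time instead would
avoid reading the file at all, at the cost of trusting the file system's
timestamps.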
Chris