Re: [O] [RFC] Simple cache mechanism for `org-element-at-point'
Hello,

Eric Abrahamsen e...@ericabrahamsen.net writes:

> I wasn't expecting to report back anything meaningful on this issue
> unless I saw a bug or problem. I've done quite a bit of day-to-day
> using these patches -- editing, agenda stuff, and exporting -- with no
> noticeable ill effects... so there's my report!

Thank you very much for testing it. I pushed it to the master branch.
Hopefully, the more functions use `org-element-at-point' and
`org-element-context', the more beneficial it will be.

Regards,

-- 
Nicolas Goaziou
Re: [O] [RFC] Simple cache mechanism for `org-element-at-point'
Nicolas Goaziou n.goaz...@gmail.com writes:

> Here is a slight change to the second one, which will correctly reset
> cache when some variables are customized or when a buffer is refreshed
> (C-c C-c on a keyword).

By the way, almost a month has passed since the first message in this
thread. Is someone still testing, or reviewing, it?

I know it is a non-trivial and quite sensitive change, so if one needs
more time to evaluate it, I certainly can wait more. Otherwise, it might
be better to simply apply it on master and cope with the bugs.

WDYT?

Regards,

-- 
Nicolas Goaziou
Re: [O] [RFC] Simple cache mechanism for `org-element-at-point'
Nicolas Goaziou n.goaz...@gmail.com writes:

> Nicolas Goaziou n.goaz...@gmail.com writes:
>
>> Here is a slight change to the second one, which will correctly reset
>> cache when some variables are customized or when a buffer is refreshed
>> (C-c C-c on a keyword).
>
> By the way, almost a month has passed since the first message in this
> thread. Is someone still testing, or reviewing, it?
>
> I know it is a non-trivial and quite sensitive change, so if one needs
> more time to evaluate it, I certainly can wait more. Otherwise, it might
> be better to simply apply it on master and cope with the bugs.

I wasn't expecting to report back anything meaningful on this issue
unless I saw a bug or problem. I've done quite a bit of day-to-day
using these patches -- editing, agenda stuff, and exporting -- with no
noticeable ill effects... so there's my report!

E
Re: [O] [RFC] Simple cache mechanism for `org-element-at-point'
Nicolas Goaziou n.goaz...@gmail.com writes:

> The following patches introduce a simple cache mechanism for both
> `org-element-at-point' and `org-element-context'. My goal is to make
> them fast enough to be used in most core commands (excepted
> headlines-only commands).
>
> Since a wrong cache can break Org behaviour badly, I would appreciate if
> it could be tested a bit. You can disable cache at any time by setting
> `org-element-use-cache' to nil and reset it with
> `org-element-cache-reset' function.
>
> It may also be interesting to tweak
> `org-element--cache-sync-idle-time' and
> `org-element--cache-merge-changes-threshold', although I don't expect
> a regular user to do it. Anyway, it may lead to better default values.
>
> Since cache is updated upon buffer modification, visibility status
> cannot be cached properly. Since it is also buggy, the first patch
> removes that data altogether.

I applied the first patch. Here is a slight change to the second one,
which will correctly reset cache when some variables are customized or
when a buffer is refreshed (C-c C-c on a keyword).

Regards,

-- 
Nicolas Goaziou

From 6fa0c2908c9cc3c768ec484ce9d7f87a971a4fa5 Mon Sep 17 00:00:00 2001
From: Nicolas Goaziou n.goaz...@gmail.com
Date: Thu, 3 Oct 2013 22:12:35 +0200
Subject: [PATCH] org-element: Implement caching for dynamic parser

* lisp/org-element.el (org-element-use-cache, org-element--cache,
org-element--cache-sync-idle-time,
org-element--cache-merge-changes-threshold,
org-element--cache-status, org-element--cache-opening-line,
org-element--cache-closing-line): New variables.
(org-element-cache-reset, org-element--cache-pending-changes-p,
org-element--cache-push-change, org-element--cache-cancel-changes,
org-element--cache-get-key, org-element-cache-get,
org-element-cache-put, org-element--shift-positions,
org-element--cache-before-change, org-element--cache-record-change,
org-element--cache-sync): New functions.
(org-element-at-point, org-element-context): Use cache when possible.
* lisp/org.el (org-mode, org-set-modules): Reset cache.
* lisp/org-footnote.el (org-footnote-section): Reset cache.
* lisp/org-src.el (org-src-preserve-indentation): Reset cache.
* testing/lisp/test-org-element.el: Update tests.

This patch gives a boost to `org-element-at-point' and, to a lesser
extent, to `org-element-context'.
---
 lisp/org-element.el              | 750 ---
 lisp/org-footnote.el             |   9 +-
 lisp/org-src.el                  |  25 +-
 lisp/org.el                      |   6 +-
 testing/lisp/test-org-element.el |  18 +-
 5 files changed, 658 insertions(+), 150 deletions(-)

diff --git a/lisp/org-element.el b/lisp/org-element.el
index 329d00a..cbe0e56 100644
--- a/lisp/org-element.el
+++ b/lisp/org-element.el
@@ -111,7 +111,8 @@
 ;;
 ;; The library ends by furnishing `org-element-at-point' function, and
 ;; a way to give information about document structure around point
-;; with `org-element-context'.
+;; with `org-element-context'. A simple cache mechanism is also
+;; provided for these functions.

 ;;; Code:

@@ -4646,7 +4647,7 @@ indentation is not done with TAB characters.
 ;; The first move is to implement a way to obtain the smallest element
 ;; containing point. This is the job of `org-element-at-point'. It
 ;; basically jumps back to the beginning of section containing point
-;; and moves, element after element, with
+;; and proceed, one element after the other, with
 ;; `org-element--current-element' until the container is found. Note:
 ;; When using `org-element-at-point', secondary values are never
 ;; parsed since the function focuses on elements, not on objects.
@@ -4654,8 +4655,417 @@ indentation is not done with TAB characters.
 ;; At a deeper level, `org-element-context' lists all elements and
 ;; objects containing point.
 ;;
-;; `org-element-nested-p' and `org-element-swap-A-B' may be used
-;; internally by navigation and manipulation tools.
+;; Both functions benefit from a simple caching mechanism. It is
+;; enabled by default, but can be disabled globally with
+;; `org-element-use-cache'. Also `org-element-cache-reset' clears or
+;; initializes cache for current buffer. Values are retrieved and put
+;; into cache with respectively, `org-element-cache-get' and
+;; `org-element-cache-put'. `org-element--cache-sync-idle-time' and
+;; `org-element--cache-merge-changes-threshold' are used internally to
+;; control caching behaviour.
+;;
+;; Eventually `org-element-nested-p' and `org-element-swap-A-B' may be
+;; used internally by navigation and manipulation tools.
+
+(defvar org-element-use-cache t
+  "Non nil when Org parser should cache its results.")
+
+(defvar org-element--cache nil
+  "Hash table used as a cache for parser.
+Key is a buffer position and value is a cons cell with the
+pattern:
+
+  \(ELEMENT . OBJECTS-DATA)
+
+where ELEMENT is the element starting at the key and OBJECTS-DATA
+is an alist where each association is:
+
+  \(POS CANDIDATES
Re: [O] [RFC] Simple cache mechanism for `org-element-at-point'
Hello,

Eric Abrahamsen e...@ericabrahamsen.net writes:

> Cool! Anything in particular that we should be looking out for
> (structure editing, export, etc)? It has so far not set my computer on
> fire.

Unfortunately, there is no simple recipe to try it out. Just use Org and,
if you notice something suspicious, disable cache and try again.

FYI, most sensitive cache operations happen when a headline, a block or
a drawer is inserted, modified or deleted.

Thanks for testing it.

Regards,

-- 
Nicolas Goaziou
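For testers following this advice, a minimal way to switch the cache off and later rebuild it might look like the snippet below. It relies only on `org-element-use-cache' and `org-element-cache-reset', both named in this thread. That `org-element-cache-reset' can be called from Lisp without arguments is an assumption, not something stated in the messages.

```elisp
;; Assumes the patched org-element.el is on the load path.
(require 'org-element)

;; Disable the parser cache while investigating a suspected cache bug.
(setq org-element-use-cache nil)

;; Once done, enable it again and reset the cache in the current Org
;; buffer (assumption: `org-element-cache-reset' needs no argument).
(setq org-element-use-cache t)
(org-element-cache-reset)
```

If the suspicious behaviour disappears with the cache disabled, that is a good hint the cache is involved and worth reporting.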
Re: [O] [RFC] Simple cache mechanism for `org-element-at-point'
Hi Nicolas,

this sounds like a great idea. I have not yet had the time to test
it - but I would like to bring forward two basic worries. Maybe you
have comments on them?

1. Updating on buffer modification hooks sounds like a very demanding
   process. You basically add a third expensive process in addition to
   font locking and org-indent-mode. My worry is that this might be
   very heavy on Emacs and slow down fast workers. Again, I did not try
   it, just a worry.

2. Do you expect this to be stable enough to deal with buffers that are
   invalid in some way or another? Are there any situations in which
   the parser could fail and leave some weird state behind?

3. Can you explain what you mean by "except in headline-only commands"?

Thank you!

- Carsten

On 3.10.2013, at 23:18, Nicolas Goaziou n.goaz...@gmail.com wrote:

> Hello,
>
> The following patches introduce a simple cache mechanism for both
> `org-element-at-point' and `org-element-context'. My goal is to make
> them fast enough to be used in most core commands (excepted
> headlines-only commands).
>
> Since a wrong cache can break Org behaviour badly, I would appreciate if
> it could be tested a bit. You can disable cache at any time by setting
> `org-element-use-cache' to nil and reset it with
> `org-element-cache-reset' function.
>
> It may also be interesting to tweak
> `org-element--cache-sync-idle-time' and
> `org-element--cache-merge-changes-threshold', although I don't expect
> a regular user to do it. Anyway, it may lead to better default values.
>
> Since cache is updated upon buffer modification, visibility status
> cannot be cached properly. Since it is also buggy, the first patch
> removes that data altogether.
>
> Feedback welcome.
>
> Regards,
>
> -- 
> Nicolas Goaziou
>
> 0001-org-element-Remove-folding-status-in-parsed-data.patch
> 0002-org-element-Implement-caching-for-dynamic-parser.patch
Re: [O] [RFC] Simple cache mechanism for `org-element-at-point'
Hello,

Carsten Dominik carsten.domi...@gmail.com writes:

> 1. Updating on buffer modification hooks sounds like a very demanding
>    process.

There is obviously a cost, but it shouldn't be very high.

I simplified the process in the announcement. Actually, the cache is not
updated right after each buffer modification. What happens is the
following:

  - After each buffer modification, a changeset is stored in
    a buffer-local variable. Building the changeset requires between
    1 and 4 regexp searches between the boundaries of the change.

  - When Emacs is idle, cache is updated according to that changeset
    (see `org-element--cache-sync-idle-time') and the changeset is
    erased.

  - If a modification happens while a previous changeset is still
    present, either both changesets are merged into a single one (see
    `org-element--cache-merge-changes-threshold'), or, in the worst
    case, a cache sync is called in order to get rid of the old
    changeset, and the new one is stored.

> You basically add a third expensive process in addition to font
> locking and org-indent-mode.

The plan is to use `org-element-at-point' for both of them, so all three
will ultimately become only one process.

> My worry is that this might be very heavy on Emacs and slow down fast
> workers. Again, I did not try it, just a worry

It obviously needs to be tested, but I would be surprised if it happened
to be a problem, at least with a compiled Org (no clue on an uncompiled
one).

> 2. Do you expect this to be stable enough to deal with buffers that are
>    invalid in some way or another? Are there any situations in which
>    the parser could fail and leave some weird state behind?

There is nothing invalid at `org-element-at-point' level (i.e. it
shouldn't error, ever). Invalid syntax means that what the parser sees
doesn't match user's expectations.

So there is, theoretically, no reason for the parser to fail. But there
are bugs, and only testing will uncover them.

> 3. Can you explain what you mean by except in headline-only commands?

`org-element-at-point' is meant to replace all `org-at-...'-like
functions. Calling `org-element-at-point' is like calling all of them at
the same time. It's more expensive than any of them, but returns more
data and is always correct.

But you don't need to know about context to tell if you're on a headline
or not, so `org-at-heading-p' is almost always a superior choice (unless
you need to also retrieve node properties). Likewise, if you only need
to manipulate headlines, you don't need any context information.

Regards,

-- 
Nicolas Goaziou
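The deferred update described in this message (record a change, merge it with a nearby pending one, sync when Emacs goes idle) can be illustrated with a rough sketch. This is not the code from the patch: every `my-cache-' name is hypothetical, both default values are made up, the dirty region is tracked with markers instead of the regexp-based changeset the patch builds, and the sync step only logs the region where a real cache would update its entries.

```elisp
;; Hypothetical sketch; none of the `my-cache-' names exist in Org.

(defvar my-cache-sync-idle-time 0.6
  "Idle seconds to wait before syncing.  Assumed value.")

(defvar my-cache-merge-threshold 200
  "Largest gap, in characters, across which two changes are merged.
Assumed value.")

(defvar-local my-cache--pending nil
  "Pending dirty region as (BEG-MARKER . END-MARKER), or nil.")

(defvar-local my-cache--timer nil
  "Idle timer that will sync `my-cache--pending', or nil.")

(defvar-local my-cache-sync-log nil
  "Regions (BEG . END) already synced, most recent first.")

(defun my-cache-sync (&optional buffer)
  "Process the pending change in BUFFER, then erase it."
  (let ((buffer (or buffer (current-buffer))))
    (when (buffer-live-p buffer)
      (with-current-buffer buffer
        (when my-cache--timer
          (cancel-timer my-cache--timer)
          (setq my-cache--timer nil))
        (when my-cache--pending
          (let ((beg (car my-cache--pending))
                (end (cdr my-cache--pending)))
            ;; A real cache would drop or shift stale entries here.
            (push (cons (marker-position beg) (marker-position end))
                  my-cache-sync-log)
            ;; Detach the markers so they stop tracking the buffer.
            (set-marker beg nil)
            (set-marker end nil))
          (setq my-cache--pending nil))))))

(defun my-cache-record-change (beg end _len)
  "Record the change between BEG and END.
Meant to be added to `after-change-functions'."
  (let ((pending my-cache--pending))
    (cond
     ;; No pending change: store this one.
     ((null pending)
      (setq my-cache--pending
            (cons (copy-marker beg) (copy-marker end t))))
     ;; Close to the pending change: merge both into one region.
     ((and (<= (- beg my-cache-merge-threshold)
               (marker-position (cdr pending)))
           (>= (+ end my-cache-merge-threshold)
               (marker-position (car pending))))
      (set-marker (car pending)
                  (min beg (marker-position (car pending))))
      (set-marker (cdr pending)
                  (max end (marker-position (cdr pending)))))
     ;; Too far apart: sync the old change, then store the new one.
     (t
      (my-cache-sync)
      (setq my-cache--pending
            (cons (copy-marker beg) (copy-marker end t))))))
  ;; Schedule a sync for the next idle period, unless one is scheduled.
  (unless my-cache--timer
    (setq my-cache--timer
          (run-with-idle-timer my-cache-sync-idle-time nil
                               #'my-cache-sync (current-buffer)))))
```

To try it, add `my-cache-record-change' to `after-change-functions' buffer-locally, edit the buffer, and inspect `my-cache-sync-log'. The hook itself only does a few comparisons and marker updates, which is the point of the design: the costly work is postponed until Emacs is idle, and only the worst case forces an immediate sync.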
Re: [O] [RFC] Simple cache mechanism for `org-element-at-point'
Nicolas Goaziou n.goaz...@gmail.com writes:

> Hello,
>
> The following patches introduce a simple cache mechanism for both
> `org-element-at-point' and `org-element-context'. My goal is to make
> them fast enough to be used in most core commands (excepted
> headlines-only commands).
>
> Since a wrong cache can break Org behaviour badly, I would appreciate if
> it could be tested a bit. You can disable cache at any time by setting
> `org-element-use-cache' to nil and reset it with
> `org-element-cache-reset' function.
>
> It may also be interesting to tweak
> `org-element--cache-sync-idle-time' and
> `org-element--cache-merge-changes-threshold', although I don't expect
> a regular user to do it. Anyway, it may lead to better default values.
>
> Since cache is updated upon buffer modification, visibility status
> cannot be cached properly. Since it is also buggy, the first patch
> removes that data altogether.

Cool! Anything in particular that we should be looking out for
(structure editing, export, etc)? It has so far not set my computer on
fire.