Re: [O] [RFC] Simple cache mechanism for `org-element-at-point'

2013-11-03 Thread Nicolas Goaziou
Hello,

Eric Abrahamsen e...@ericabrahamsen.net writes:

 I wasn't expecting to report back anything meaningful on this issue
 unless I saw a bug or problem. I've done quite a bit of day-to-day using
 these patches -- editing, agenda stuff, and exporting -- with no
 noticeable ill effects... so there's my report!

Thank you very much for testing it. I pushed it to the master branch.

Hopefully, the more functions use `org-element-at-point' and
`org-element-context', the more beneficial it will be.


Regards,

-- 
Nicolas Goaziou



Re: [O] [RFC] Simple cache mechanism for `org-element-at-point'

2013-10-30 Thread Nicolas Goaziou
Nicolas Goaziou n.goaz...@gmail.com writes:

 Here is a slight change to the second one, which will correctly reset
 cache when some variables are customized or when a buffer is refreshed
 (C-c C-c on a keyword).

By the way, almost a month has passed since the first message in this
thread. Is anyone still testing or reviewing it? I know it is
a non-trivial and quite sensitive change, so if anyone needs more time to
evaluate it, I can certainly wait longer.

Otherwise, it might be better to simply apply it on master and cope with
the bugs.

WDYT?


Regards,

-- 
Nicolas Goaziou



Re: [O] [RFC] Simple cache mechanism for `org-element-at-point'

2013-10-30 Thread Eric Abrahamsen
Nicolas Goaziou n.goaz...@gmail.com writes:

 Nicolas Goaziou n.goaz...@gmail.com writes:

 Here is a slight change to the second one, which will correctly reset
 cache when some variables are customized or when a buffer is refreshed
 (C-c C-c on a keyword).

 By the way, almost a month has passed since the first message in this
 thread. Is someone still testing, or reviewing, it? I know it is
 a non-trivial and quite sensitive change, so if one needs more time to
 evaluate it, I certainly can wait more.

 Otherwise, it might be better to simply apply it on master and cope with
 the bugs.

I wasn't expecting to report back anything meaningful on this issue
unless I saw a bug or problem. I've done quite a bit of day-to-day using
these patches -- editing, agenda stuff, and exporting -- with no
noticeable ill effects... so there's my report!

E




Re: [O] [RFC] Simple cache mechanism for `org-element-at-point'

2013-10-27 Thread Nicolas Goaziou
Nicolas Goaziou n.goaz...@gmail.com writes:

 The following patches introduce a simple cache mechanism for both
 `org-element-at-point' and `org-element-context'. My goal is to make
 them fast enough to be used in most core commands (excepted
 headlines-only commands).

 Since a wrong cache can break Org behaviour badly, I would appreciate if
 it could be tested a bit. You can disable cache at any time by setting
 `org-element-use-cache' to nil and reset it with
 `org-element-cache-reset' function.

 It may also be interesting to tweak `org-element--cache-sync-idle-time'
 and `org-element--cache-merge-changes-threshold', although I don't
 expect a regular user to do it. Anyway, it may lead to better default
 values.

 Since cache is updated upon buffer modification, visibility status
 cannot be cached properly. Since it is also buggy, the first patch
 removes that data altogether.

I applied the first patch.

Here is a slight change to the second one, which will correctly reset
cache when some variables are customized or when a buffer is refreshed
(C-c C-c on a keyword).


Regards,

-- 
Nicolas Goaziou
From 6fa0c2908c9cc3c768ec484ce9d7f87a971a4fa5 Mon Sep 17 00:00:00 2001
From: Nicolas Goaziou n.goaz...@gmail.com
Date: Thu, 3 Oct 2013 22:12:35 +0200
Subject: [PATCH] org-element: Implement caching for dynamic parser

* lisp/org-element.el (org-element-use-cache, org-element--cache,
org-element--cache-sync-idle-time,
org-element--cache-merge-changes-threshold, org-element--cache-status,
org-element--cache-opening-line, org-element--cache-closing-line): New
variables.
(org-element-cache-reset, org-element--cache-pending-changes-p,
org-element--cache-push-change, org-element--cache-cancel-changes,
org-element--cache-get-key, org-element-cache-get,
org-element-cache-put, org-element--shift-positions,
org-element--cache-before-change, org-element--cache-record-change,
org-element--cache-sync): New functions.
(org-element-at-point, org-element-context): Use cache when possible.
* lisp/org.el (org-mode, org-set-modules): Reset cache.
* lisp/org-footnote.el (org-footnote-section): Reset cache.
* lisp/org-src.el (org-src-preserve-indentation): Reset cache.
* testing/lisp/test-org-element.el: Update tests.

This patch gives a boost to `org-element-at-point' and, to a lesser
extent, to `org-element-context'.
---
 lisp/org-element.el              | 750 ---
 lisp/org-footnote.el             |   9 +-
 lisp/org-src.el                  |  25 +-
 lisp/org.el                      |   6 +-
 testing/lisp/test-org-element.el |  18 +-
 5 files changed, 658 insertions(+), 150 deletions(-)

diff --git a/lisp/org-element.el b/lisp/org-element.el
index 329d00a..cbe0e56 100644
--- a/lisp/org-element.el
+++ b/lisp/org-element.el
@@ -111,7 +111,8 @@
 ;;
 ;; The library ends by furnishing `org-element-at-point' function, and
 ;; a way to give information about document structure around point
-;; with `org-element-context'.
+;; with `org-element-context'.  A simple cache mechanism is also
+;; provided for these functions.
 
 
 ;;; Code:
@@ -4646,7 +4647,7 @@ indentation is not done with TAB characters.
 ;; The first move is to implement a way to obtain the smallest element
 ;; containing point.  This is the job of `org-element-at-point'.  It
 ;; basically jumps back to the beginning of section containing point
-;; and moves, element after element, with
+;; and proceed, one element after the other, with
 ;; `org-element--current-element' until the container is found.  Note:
 ;; When using `org-element-at-point', secondary values are never
 ;; parsed since the function focuses on elements, not on objects.
@@ -4654,8 +4655,417 @@ indentation is not done with TAB characters.
 ;; At a deeper level, `org-element-context' lists all elements and
 ;; objects containing point.
 ;;
-;; `org-element-nested-p' and `org-element-swap-A-B' may be used
-;; internally by navigation and manipulation tools.
+;; Both functions benefit from a simple caching mechanism.  It is
+;; enabled by default, but can be disabled globally with
+;; `org-element-use-cache'.  Also `org-element-cache-reset' clears or
+;; initializes cache for current buffer.  Values are retrieved and put
+;; into cache with respectively, `org-element-cache-get' and
+;; `org-element-cache-put'.  `org-element--cache-sync-idle-time' and
+;; `org-element--cache-merge-changes-threshold' are used internally to
+;; control caching behaviour.
+;;
+;; Eventually `org-element-nested-p' and `org-element-swap-A-B' may be
+;; used internally by navigation and manipulation tools.
+
+(defvar org-element-use-cache t
+  "Non nil when Org parser should cache its results.")
+
+(defvar org-element--cache nil
+  "Hash table used as a cache for parser.
+Key is a buffer position and value is a cons cell with the
+pattern:
+
+  \(ELEMENT . OBJECTS-DATA)
+
+where ELEMENT is the element starting at the key and OBJECTS-DATA
+is an alist where each association is:
+
+  \(POS CANDIDATES 

Re: [O] [RFC] Simple cache mechanism for `org-element-at-point'

2013-10-04 Thread Nicolas Goaziou
Hello,

Eric Abrahamsen e...@ericabrahamsen.net writes:

 Cool! Anything in particular that we should be looking out for
 (structure editing, export, etc)? It has so far not set my computer on
 fire.

Unfortunately, there is no simple recipe to try it out. Just use Org
and, if you notice something suspicious, disable cache and try again.
FYI, the most sensitive cache operations happen when a headline, a block or
a drawer is inserted, modified or deleted.
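
A minimal sketch of that disable-and-retry step, assuming the patched
org-element.el is loaded. Both names come from the announcement; calling
`org-element-cache-reset' with no argument is an assumption, since its
argument list is not shown in this thread.

```elisp
;; Turn the cache off, then repeat the action that looked suspicious.
(setq org-element-use-cache nil)

;; If the problem disappears, the cache is a likely suspect.  To go back
;; to cached parsing, re-enable it and start from a clean cache.
(setq org-element-use-cache t)
(when (fboundp 'org-element-cache-reset)
  ;; Assumption: the function can be called without arguments.
  (org-element-cache-reset))
```

Evaluating the forms one at a time (for example with C-x C-e) in the Org
buffer being tested keeps the comparison between cached and uncached
behaviour easy to follow.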

Thanks for testing it.


Regards,

-- 
Nicolas Goaziou



Re: [O] [RFC] Simple cache mechanism for `org-element-at-point'

2013-10-04 Thread Carsten Dominik
Hi Nicolas,

this sounds like a great idea.  I have not yet had the time to
test it - but I would like to bring forward two basic worries.
Maybe you have comments on them?

1. Updating on buffer modification hooks sounds like a very
   demanding process.  You basically add a third expensive process
   in addition to font locking and org-indent-mode.  My worry is
   that this might be very heavy on Emacs and slow down fast workers.
   Again, I did not try it, just a worry.

2. Do you expect this to be stable enough to deal with buffers that
   are invalid in some way or another?  Are there any situations in
   which the parser could fail and leave some weird state behind?

3. Can you explain what you mean by except in headline-only commands?

Thank you!

- Carsten

On 3.10.2013, at 23:18, Nicolas Goaziou n.goaz...@gmail.com wrote:

 Hello,
 
 The following patches introduce a simple cache mechanism for both
 `org-element-at-point' and `org-element-context'. My goal is to make
 them fast enough to be used in most core commands (excepted
 headlines-only commands).
 
 Since a wrong cache can break Org behaviour badly, I would appreciate if
 it could be tested a bit. You can disable cache at any time by setting
 `org-element-use-cache' to nil and reset it with
 `org-element-cache-reset' function.
 
 It may also be interesting to tweak `org-element--cache-sync-idle-time'
 and `org-element--cache-merge-changes-threshold', although I don't
 expect a regular user to do it. Anyway, it may lead to better default
 values.
 
 Since cache is updated upon buffer modification, visibility status
 cannot be cached properly. Since it is also buggy, the first patch
 removes that data altogether.
 
 Feedback welcome.
 
 
 Regards,
 
 -- 
 Nicolas Goaziou
 0001-org-element-Remove-folding-status-in-parsed-data.patch
 0002-org-element-Implement-caching-for-dynamic-parser.patch





Re: [O] [RFC] Simple cache mechanism for `org-element-at-point'

2013-10-04 Thread Nicolas Goaziou
Hello,

Carsten Dominik carsten.domi...@gmail.com writes:

 1. Updating on buffer modification hooks sounds like a very
    demanding process.

There is obviously a cost, but it shouldn't be very high. I simplified
the process in the announcement. Actually, the cache is not updated
right after each buffer modification. What happens is the following:

  - After each buffer modification, a changeset is stored in
    a buffer-local variable. Building the changeset requires between
    1 and 4 regexp searches between the boundaries of the change.

  - When Emacs is idle, the cache is updated according to that changeset
    (see `org-element--cache-sync-idle-time') and the changeset is erased.

  - If a modification happens while a previous changeset is still
    present, either the changesets are merged into a single one (see
    `org-element--cache-merge-changes-threshold'), or, in the worst
    case, a cache sync is called in order to get rid of the old
    changeset, and the new one is stored.
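
The three steps above can be pictured with a toy model. This is only a
sketch and not the code from the patch: every `demo-' name is
hypothetical, the threshold value is made up, the changeset is kept in a
global variable instead of a buffer-local one, and position shifts
between successive changes are ignored.

```elisp
(defvar demo-merge-threshold 200
  "Assumed maximum gap, in characters, between two mergeable changes.")

(defvar demo-pending nil
  "Pending changeset as a list (BEG END OFFSET), or nil when there is none.")

(defvar demo-sync-count 0
  "Number of times the pretend cache has been synchronized.")

(defun demo-sync ()
  "Pretend to update the cache from `demo-pending', then erase it."
  (when demo-pending
    (setq demo-sync-count (1+ demo-sync-count))
    (setq demo-pending nil)))

(defun demo-record-change (beg end offset)
  "Record a change between BEG and END that shifted later text by OFFSET."
  (if (null demo-pending)
      ;; No changeset waiting: just store the new one.
      (setq demo-pending (list beg end offset))
    (let ((pbeg (nth 0 demo-pending))
          (pend (nth 1 demo-pending))
          (poffset (nth 2 demo-pending)))
      (if (and (<= beg (+ pend demo-merge-threshold))
               (>= end (- pbeg demo-merge-threshold)))
          ;; Close enough to the waiting changeset: merge both into one.
          (setq demo-pending (list (min beg pbeg)
                                   (max end pend)
                                   (+ offset poffset)))
        ;; Worst case: sync now to flush the old changeset, store the new.
        (demo-sync)
        (setq demo-pending (list beg end offset))))))
```

In a real setup, something like `demo-record-change' would be called from
`after-change-functions', and `demo-sync' from a timer created with
`run-with-idle-timer', so the expensive part only runs when Emacs is
idle.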

 You basically add a third expensive process in addition to font
 locking and org-indent-mode.

The plan is to use `org-element-at-point' for both of them, so all three
will ultimately become only one process.

 My worry is that this might be very heavy on Emacs and slow down
 fast workers. Again, I did not try it, just a worry

It obviously needs to be tested, but I would be surprised if it happened
to be a problem, at least with a compiled Org (no clue on an uncompiled
one).

 2. Do you expect this to be stable enough to deal with buffers that
    are invalid in some way or another? Are there any situations in which
    the parser could fail and leave some weird state behind?

There is nothing invalid at `org-element-at-point' level (i.e. it
shouldn't error, ever). Invalid syntax means that what the parser sees
doesn't match user's expectations. So there is, theoretically, no reason
for the parser to fail. But there are bugs, and only testing will
uncover them.

 3. Can you explain what you mean by except in headline-only commands?

`org-element-at-point' is meant to replace all `org-at-...'-like
functions. Calling `org-element-at-point' is like calling all of them at
the same time. It's more expensive than any of them, but returns more
data and is always correct.

But you don't need to know about context to tell if you're on
a headline or not, so `org-at-heading-p' is almost always a superior
choice (unless you need to also retrieve node properties). Likewise, if
you only need to manipulate headlines, you don't need any context
information.
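
As an illustration of that trade-off, here is a small sketch.
`demo-describe-point' is a hypothetical helper, not part of Org; it takes
the cheap headline check first and only asks the parser when point is
somewhere else.

```elisp
(require 'org)
(require 'org-element)

(defun demo-describe-point ()
  "Return a symbol naming the kind of element at point in an Org buffer."
  (if (org-at-heading-p)
      ;; Cheap test: no context is needed to recognize a headline.
      'headline
    ;; Anything else: ask the parser, which knows about context.
    (org-element-type (org-element-at-point))))
```

On a headline line this never calls the parser; on any other line the
result is the element type reported by `org-element-at-point'.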


Regards,

-- 
Nicolas Goaziou



Re: [O] [RFC] Simple cache mechanism for `org-element-at-point'

2013-10-03 Thread Eric Abrahamsen
Nicolas Goaziou n.goaz...@gmail.com writes:

 Hello,

 The following patches introduce a simple cache mechanism for both
 `org-element-at-point' and `org-element-context'. My goal is to make
 them fast enough to be used in most core commands (excepted
 headlines-only commands).

 Since a wrong cache can break Org behaviour badly, I would appreciate if
 it could be tested a bit. You can disable cache at any time by setting
 `org-element-use-cache' to nil and reset it with
 `org-element-cache-reset' function.

 It may also be interesting to tweak `org-element--cache-sync-idle-time'
 and `org-element--cache-merge-changes-threshold', although I don't
 expect a regular user to do it. Anyway, it may lead to better default
 values.

 Since cache is updated upon buffer modification, visibility status
 cannot be cached properly. Since it is also buggy, the first patch
 removes that data altogether.

Cool! Anything in particular that we should be looking out for
(structure editing, export, etc)? It has so far not set my computer on
fire.