Re: [O] How to make agenda generation faster

2018-10-20 Thread Nicolas Goaziou
Hello,

Adam Porter writes:

> Yes, because this is the fastest way to search for matching entries in a
> buffer, when it's possible to use a regexp search.

You would still do regexp searches, but not at the time of queries.

> That would be ideal. The problem I foresee is that, when a buffer's cache
> is not up-to-date, and the user runs an agenda query, the user will have to
> wait for the buffer to be parsed and cached, which is much slower than a
> regexp search through the buffer.

No, because filling the cache is still a regexp search.

> That was what I first tried with org-agenda-ng: I parsed the whole buffer
> with org-element and ran predicates against the element tree.

Org Element is not needed, and shouldn't even be used, to retrieve most
agenda-related data.

There are exceptions of course, mainly plain timestamps and clocks. This
is where the current agenda is hard to beat, because 1. it cheats and
includes timestamps without checking context, 2. it only searches for
timestamps related to the day being displayed in the agenda view. The
last point makes it particularly fast for single day views.

> Another idea I've had, similar to yours, would be to pre-process buffers,
> adding metadata as text-properties on heading lines. However, I haven't
> tested it, and I don't know what the performance would be like. And it
> would still suffer from the caching problem I mentioned.

It is still a way to cache stuff. The difficulty here is to keep data
up-to-date with changes. Storing per-node cache could be nice,
nevertheless.

> I think the fundamental problems are 1) keeping the cache in sync with the
> raw buffer,

Yes, whole-buffer caching is simpler here: drop all cached data if
buffer contents differ from the cached one. That's what I did in my
last attempt to speed up agenda, comparing md5sums. It works reasonably
well.
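
A minimal sketch of the whole-buffer invalidation described above, not
the code from that attempt. The names `my/agenda-cache' and
`my/agenda-cached-data' are hypothetical; `md5' is the built-in function,
which accepts a buffer as its argument.

```elisp
(defvar-local my/agenda-cache nil
  "Hypothetical cache: a cons cell (MD5 . DATA) for the current buffer.")

(defun my/agenda-cached-data (extract-fn)
  "Return cached data for the current buffer, refilling it when stale.
EXTRACT-FN is called with no arguments to rebuild the data whenever
the buffer's md5sum differs from the one stored with the cache."
  (let ((sum (md5 (current-buffer))))
    (unless (equal sum (car my/agenda-cache))
      ;; Contents changed (or nothing cached yet): drop everything.
      (setq my/agenda-cache (cons sum (funcall extract-fn))))
    (cdr my/agenda-cache)))
```

Hashing the whole buffer costs time proportional to its size on every
lookup, which is the price paid for not tracking individual changes.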

I also cached per agenda data type (schedules, deadlines, clocks…), but
that means you know something about the query. I think querying and
searching should be separated, so it shouldn't be done.

> and 2) the slow speed of parsing an entire buffer's metadata at
> once (depending on the size of the files, of course, but mine are big
> enough to be slow, and I'm sure many users have larger ones).

I think this could be solved by fetching data preemptively during idle
time. It would also work well with per-node caching, since you can
interrupt fetching easily.
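
A rough sketch of interruptible fetching during idle time, under the
assumption that work is queued per file. `my/agenda-pending-files',
`my/agenda-fetch-some' and `my/fetch-one-file' are hypothetical names;
`input-pending-p' and `run-with-idle-timer' are standard Emacs functions.

```elisp
(defvar my/agenda-pending-files nil
  "Hypothetical queue of files whose data still has to be fetched.")

(defun my/agenda-fetch-some (fetch-fn)
  "Call FETCH-FN on queued files until the queue is empty or input arrives.
Return the number of files processed during this call."
  (let ((count 0))
    ;; Stop as soon as the user types something; the remaining files
    ;; stay in the queue for the next idle period.
    (while (and my/agenda-pending-files (not (input-pending-p)))
      (funcall fetch-fn (pop my/agenda-pending-files))
      (setq count (1+ count)))
    count))

;; Resume fetching whenever Emacs has been idle for two seconds
;; (`my/fetch-one-file' stands for whatever fills the cache for a file):
;; (run-with-idle-timer 2 t #'my/agenda-fetch-some #'my/fetch-one-file)
```

With per-node caching the queue could hold headings instead of files,
which would make each step shorter and the interruption finer-grained.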


Regards,

-- 
Nicolas Goaziou



Re: [O] How to make agenda generation faster

2018-10-19 Thread Adam Porter
On Oct 18, 2018 5:48 PM, "Nicolas Goaziou" wrote:

> Are you saying that queries are turned into regexp searches within Org
> files? If so, I don't think they should.

Yes, because this is the fastest way to search for matching entries in a
buffer, when it's possible to use a regexp search.

> Queries should only operate on the output of the data extraction,
> possibly a list of defstructs. I.e., you first extract all meaningful data
> from the document (during idle time, with cache, or whatever optimization
> would be chosen), store it in an appropriate format, then query it.
>
> WDYT?

That would be ideal. The problem I foresee is that, when a buffer's cache
is not up-to-date, and the user runs an agenda query, the user will have to
wait for the buffer to be parsed and cached, which is much slower than a
regexp search through the buffer.

That was what I first tried with org-agenda-ng: I parsed the whole buffer
with org-element and ran predicates against the element tree.  It was much
too slow to be practical, so I switched to the current approach, which runs
predicates against each node, only checking the necessary metadata. It's
fast enough to be useful, but can still be slow in some cases, and I don't
think it would be fast enough as a replacement for the current agenda
code.  But with further optimization, like using whole-buffer regexp
searches when possible, it might be.

Another idea I've had, similar to yours, would be to pre-process buffers,
adding metadata as text-properties on heading lines. However, I haven't
tested it, and I don't know what the performance would be like. And it
would still suffer from the caching problem I mentioned.

I think the fundamental problems are 1) keeping the cache in sync with the
raw buffer, and 2) the slow speed of parsing an entire buffer's metadata at
once (depending on the size of the files, of course, but mine are big
enough to be slow, and I'm sure many users have larger ones).

Of course, maybe someone cleverer than me can figure out a clever solution
to these problems. :)


Re: [O] How to make agenda generation faster

2018-10-18 Thread stardiviner


>> However, before it could be suitable as a possible replacement, it will
>> likely require more optimization.  Some queries, especially more complex
>> ones, are slower than the equivalent searches and agendas in the current
>> Org Agenda code.  This is because of the way the queries run predicates
>> on each heading.  Despite the current Org Agenda code's complexity, it
>> is well optimized and hard to beat.
>
> Are you saying that queries are turned into regexp searches within Org
> files? If so, I don't think they should.
>
> Queries should only operate on the output of the data extraction,
> possibly a list of defstructs. I.e., you first extract all meaningful
> data from the document (during idle time, with cache, or whatever
> optimization would be chosen), store it in an appropriate format, then
> query it.
>

I think the same way. Some libraries, like Clojure's enlive, handle
HTML strings the same way.

--
[ stardiviner ] don't need to convince with trends.
   Blog: https://stardiviner.github.io/
   IRC(freenode): stardiviner
   GPG: F09F650D7D674819892591401B5DF1C95AE89AC3



Re: [O] How to make agenda generation faster

2018-10-18 Thread Nicolas Goaziou
Hello,

Adam Porter writes:

> Org is welcome to take any of the org-ql or org-ql-agenda code you think
> would be useful.

Thank you.

> However, before it could be suitable as a possible replacement, it will
> likely require more optimization.  Some queries, especially more complex
> ones, are slower than the equivalent searches and agendas in the current
> Org Agenda code.  This is because of the way the queries run predicates
> on each heading.  Despite the current Org Agenda code's complexity, it
> is well optimized and hard to beat.

Are you saying that queries are turned into regexp searches within Org
files? If so, I don't think they should. 

Queries should only operate on the output of the data extraction,
possibly a list of defstructs. I.e., you first extract all meaningful
data from the document (during idle time, with cache, or whatever
optimization would be chosen), store it in an appropriate format, then
query it.
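
To illustrate the separation proposed above, here is a small sketch that
queries a list of structs instead of the buffer. The struct `my/entry',
its slots and `my/query' are hypothetical and only stand in for whatever
format the extraction step would produce.

```elisp
(require 'cl-lib)

(cl-defstruct my/entry
  "Hypothetical record holding the data extracted from one heading."
  heading todo tags scheduled)

(defun my/query (entries predicate)
  "Return the elements of ENTRIES that satisfy PREDICATE.
ENTRIES is a list of `my/entry' structs produced by a separate
extraction step; the buffer itself is never searched here."
  (cl-remove-if-not predicate entries))
```

A query for all TODO entries tagged "work" could then be written as:

```elisp
(my/query entries
          (lambda (e)
            (and (equal (my/entry-todo e) "TODO")
                 (member "work" (my/entry-tags e)))))
```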

WDYT?

Regards,

-- 
Nicolas Goaziou



Re: [O] How to make agenda generation faster

2018-10-17 Thread Adam Porter
Nicolas Goaziou writes:

> It can, but AFAIK, it doesn't yet. It also means un-optimized lexical
> binding may be slightly slower than dynamic scoping for the time
> being.

Well, I can't vouch for it myself, because I haven't studied the code.
But here's one of the resources that suggests it is faster to use
lexical binding:

https://emacs.stackexchange.com/questions/2129/why-is-let-faster-with-lexical-scope

> That's not exactly what I'm suggesting. I suggest to move the work in
> Org tree, e.g., as an org-agenda-ng.el library, and, from there,
> implement back most of the features of the current agenda.
>
> Org cannot really benefit from libraries living outside Emacs, as we
> recently learnt with htmlize issue.

Org is welcome to take any of the org-ql or org-ql-agenda code you think
would be useful.

However, before it could be suitable as a possible replacement, it will
likely require more optimization.  Some queries, especially more complex
ones, are slower than the equivalent searches and agendas in the current
Org Agenda code.  This is because of the way the queries run predicates
on each heading.  Despite the current Org Agenda code's complexity, it
is well optimized and hard to beat.

I have a proof-of-concept branch that begins to implement a relatively
simple optimization that converts one suitable predicate in a query to a
buffer-global regexp search.  It significantly improves speed in some
cases, but a query with several predicates still has to run all but one
of them as predicates.

Another possible optimization would be to convert as many predicates in
a query to buffer regexp searches as possible, collecting a list of
heading positions in the buffer, and then do a final pass with the
appropriate union/intersection/difference operations on the lists.  Then
the list of positions could be used to gather the heading data.  I use a
similar technique in helm-org-rifle, and it seems to work quickly.  It
would require some work on a sort of "query compiler" to do the
transformation and optimization.  I don't have much experience with that
kind of programming; maybe someone else would be interested in helping
with that.
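
The idea in the previous paragraph could be sketched roughly as follows.
This is not code from org-ql or helm-org-rifle; `my/heading-positions'
and `my/headings-matching-all' are hypothetical names, headings are
assumed to be lines starting with stars, and only intersection is shown.

```elisp
(require 'cl-lib)

(defun my/heading-positions (regexp)
  "Return start positions of headings whose entry text matches REGEXP.
Assumes REGEXP never matches the empty string."
  (let (positions)
    (save-excursion
      (goto-char (point-min))
      (while (re-search-forward regexp nil t)
        (save-excursion
          ;; Move to end of line so a match on the heading line itself
          ;; still finds that heading when searching backward.
          (end-of-line)
          (when (re-search-backward "^\\*+ " nil t)
            (cl-pushnew (point) positions)))))
    (nreverse positions)))

(defun my/headings-matching-all (regexps)
  "Return sorted positions of headings matching every regexp in REGEXPS.
REGEXPS must be a non-empty list."
  (sort (cl-reduce #'cl-intersection
                   (mapcar #'my/heading-positions regexps))
        #'<))
```

Each regexp costs one pass over the buffer, and the set operations then
work on plain lists of integers rather than on buffer text.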

So before taking any of the code into Org itself, you might want to
consider these issues and decide whether it could be a suitable
approach.  Let me know what you'd like to do and how I can help.

Thanks,
Adam




Re: [O] How to make agenda generation faster

2018-10-17 Thread Nicolas Goaziou
Hello,

Adam Porter writes:

> From what I've read, the byte-compiler can optimize better when
> lexical-binding is used.

It can, but AFAIK, it doesn't yet. It also means un-optimized lexical
binding may be slightly slower than dynamic scoping for the time being.

> I've thought about this for a while.  It seems to me that the issue is
> that Org buffers are, of course, plain-text buffers.  There is no
> persistent, in-memory representation other than the buffer, so whenever
> Org needs structured/semantic data, it must parse it out of the buffer,
> which is necessarily rather slow.  If there were a way to keep an
> outline tree in memory, parallel to the buffer itself, that would allow
> operations like search, agenda, etc. to be greatly sped up.

I don't think that's necessary. File caching, as you suggest below, can
go a long way. Filling the cache during idle time, too.

> But how would that work in Emacs?  Theoretically, we could write some
> code, applied on self-insert-command, to update the "parallel tree
> structure" as the user manipulates the plain-text in the buffer
> (e.g. add a new node when the user types a "*" to create a new heading),
> and also apply it to functions that manipulate the outline structurally
> in the buffer.  But, of course, that sounds very complicated.  I would
> not relish the idea of debugging code to keep a cached tree in sync with
> a plain-text buffer outline.  :)

My over-engineering-o-meter flashes red, too.

> Anyway, org-ql tries to do some of what you mentioned.  It does
> rudimentary, per-buffer, per-query caching (as long as the buffer is not
> modified, the cache remains valid), which helps when there are several
> Org files open that are referred to often but not as often modified.

That's what I did in an agenda upgrade I tried a few months ago.
Unfortunately, caching is not compatible with the underlying logic of
current Agenda, in particular with `org-agenda-skip-function'.

> And the query and presentation code are separated (org-ql and
> org-ql-agenda).

That's a very good thing.

> I don't know how widely it's used, but the repo is getting some regular
> traffic, and I'm using it as the backend for my org-sidebar package.
> I'd be happy if it could be made more generally useful, or if it could
> be helpful to Org itself in some way.  Contributions are welcome.

That's not exactly what I'm suggesting. I suggest moving the work into
the Org tree, e.g., as an org-agenda-ng.el library, and, from there,
re-implementing most of the features of the current agenda.

Org cannot really benefit from libraries living outside Emacs, as we
recently learnt with the htmlize issue.

Regards,

-- 
Nicolas Goaziou



Re: [O] How to make agenda generation faster

2018-10-17 Thread Ihor Radchenko

>   I've thought about this for a while.  It seems to me that the issue is
>   that Org buffers are, of course, plain-text buffers.  There is no
>   persistent, in-memory representation other than the buffer, so whenever
>   Org needs structured/semantic data, it must parse it out of the buffer,
>   which is necessarily rather slow.  If there were a way to keep an
>   outline tree in memory, parallel to the buffer itself, that would allow
>   operations like search, agenda, etc. to be greatly sped up.

FYI
A while ago I saw some cache implementation in org-element.el.
Take a look at org-element--cache variable definition and the code
below.


(defvar org-element--cache nil
  "AVL tree used to cache elements.
Each node of the tree contains an element.  Comparison is done
with `org-element--cache-compare'.  This cache is used in
`org-element-at-point'.")


Best,
Ihor


Adam Porter writes:

> Nicolas Goaziou  writes:
>
>>> my understanding is that code that runs with lexical-binding enabled
>>> is generally faster.
>>
>> Not really. But it's certainly easier to understand since it removes one
>> class of problems.
>
> From what I've read, the byte-compiler can optimize better when
> lexical-binding is used.
>
>> Instead of re-inventing the wheel, or putting efforts into a
>> wheel-like invention, wouldn't it make sense to actually work on Org
>> Agenda itself?
>>
>> So again, wouldn't it be nice to think about Org Agenda-ng?
>
> As a matter of fact, what's now called org-ql-agenda was originally
> called org-agenda-ng.  I factored org-ql out of it and realized that it
> should probably be its own, standalone package.  Then I renamed
> org-agenda-ng to org-ql-agenda, so I could reasonably keep them in the
> same repo, and because I don't know if I will ever develop it far enough
> to be worthy of the name org-agenda-ng.  It started as an experiment to
> build a foundation for a new, modular agenda implementation, and maybe
> it could be.
>
>> I didn't look closely at org-ql, but I had the idea of splitting the
>> Agenda in two distinct parts. One would be responsible for collecting,
>> possibly asynchronously, and caching data from Org documents. The other
>> one would provide a DSL to query and display the results extracted from
>> the output of the first part. The second part could even be made generic
>> enough to be extracted from Org and become some part of Emacs.
>> Displaying filtered data, maybe in a timeline, could be useful for other
>> packages. Unfortunately, I don't have time to work on this. Ah well.
>
> I've thought about this for a while.  It seems to me that the issue is
> that Org buffers are, of course, plain-text buffers.  There is no
> persistent, in-memory representation other than the buffer, so whenever
> Org needs structured/semantic data, it must parse it out of the buffer,
> which is necessarily rather slow.  If there were a way to keep an
> outline tree in memory, parallel to the buffer itself, that would allow
> operations like search, agenda, etc. to be greatly sped up.
>
> But how would that work in Emacs?  Theoretically, we could write some
> code, applied on self-insert-command, to update the "parallel tree
> structure" as the user manipulates the plain-text in the buffer
> (e.g. add a new node when the user types a "*" to create a new heading),
> and also apply it to functions that manipulate the outline structurally
> in the buffer.  But, of course, that sounds very complicated.  I would
> not relish the idea of debugging code to keep a cached tree in sync with
> a plain-text buffer outline.  :)
>
> Besides that, AFAIK there would be no way to do it asynchronously other
> than calling out to a child Emacs process (because elisp is still
> single-threaded), printing and reading the data back and forth (which
> would tie up the parent process when reading).  Maybe in the future
> elisp will be multithreaded...
>
> Anyway, org-ql tries to do some of what you mentioned.  It does
> rudimentary, per-buffer, per-query caching (as long as the buffer is not
> modified, the cache remains valid), which helps when there are several
> Org files open that are referred to often but not as often modified.
> And the query and presentation code are separated (org-ql and
> org-ql-agenda).
>
> I don't know how widely it's used, but the repo is getting some regular
> traffic, and I'm using it as the backend for my org-sidebar package.
> I'd be happy if it could be made more generally useful, or if it could
> be helpful to Org itself in some way.  Contributions are welcome.
>
>




Re: [O] How to make agenda generation faster

2018-10-16 Thread Adam Porter
Nicolas Goaziou writes:

>> my understanding is that code that runs with lexical-binding enabled
>> is generally faster.
>
> Not really. But it's certainly easier to understand since it removes one
> class of problems.

From what I've read, the byte-compiler can optimize better when
lexical-binding is used.

> Instead of re-inventing the wheel, or putting efforts into a
> wheel-like invention, wouldn't it make sense to actually work on Org
> Agenda itself?
>
> So again, wouldn't it be nice to think about Org Agenda-ng?

As a matter of fact, what's now called org-ql-agenda was originally
called org-agenda-ng.  I factored org-ql out of it and realized that it
should probably be its own, standalone package.  Then I renamed
org-agenda-ng to org-ql-agenda, so I could reasonably keep them in the
same repo, and because I don't know if I will ever develop it far enough
to be worthy of the name org-agenda-ng.  It started as an experiment to
build a foundation for a new, modular agenda implementation, and maybe
it could be.

> I didn't look closely at org-ql, but I had the idea of splitting the
> Agenda in two distinct parts. One would be responsible for collecting,
> possibly asynchronously, and caching data from Org documents. The other
> one would provide a DSL to query and display the results extracted from
> the output of the first part. The second part could even be made generic
> enough to be extracted from Org and become some part of Emacs.
> Displaying filtered data, maybe in a timeline, could be useful for other
> packages. Unfortunately, I don't have time to work on this. Ah well.

I've thought about this for a while.  It seems to me that the issue is
that Org buffers are, of course, plain-text buffers.  There is no
persistent, in-memory representation other than the buffer, so whenever
Org needs structured/semantic data, it must parse it out of the buffer,
which is necessarily rather slow.  If there were a way to keep an
outline tree in memory, parallel to the buffer itself, that would allow
operations like search, agenda, etc. to be greatly sped up.

But how would that work in Emacs?  Theoretically, we could write some
code, applied on self-insert-command, to update the "parallel tree
structure" as the user manipulates the plain-text in the buffer
(e.g. add a new node when the user types a "*" to create a new heading),
and also apply it to functions that manipulate the outline structurally
in the buffer.  But, of course, that sounds very complicated.  I would
not relish the idea of debugging code to keep a cached tree in sync with
a plain-text buffer outline.  :)

Besides that, AFAIK there would be no way to do it asynchronously other
than calling out to a child Emacs process (because elisp is still
single-threaded), printing and reading the data back and forth (which
would tie up the parent process when reading).  Maybe in the future
elisp will be multithreaded...

Anyway, org-ql tries to do some of what you mentioned.  It does
rudimentary, per-buffer, per-query caching (as long as the buffer is not
modified, the cache remains valid), which helps when there are several
Org files open that are referred to often but not as often modified.
And the query and presentation code are separated (org-ql and
org-ql-agenda).
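
For illustration only, per-buffer, per-query caching of this kind might
look like the sketch below. It is not org-ql's actual implementation;
`my/query-cache' and `my/cached-query' are hypothetical names, while
`buffer-chars-modified-tick' is the standard Emacs function.

```elisp
(defvar-local my/query-cache nil
  "Hypothetical cons cell (TICK . TABLE) of cached query results.")

(defun my/cached-query (query run-fn)
  "Return the result of QUERY in the current buffer, using a cache.
RUN-FN is called with QUERY when no valid cached result exists.
The whole cache is dropped once the buffer text has been modified."
  (let ((tick (buffer-chars-modified-tick)))
    (unless (eql tick (car my/query-cache))
      (setq my/query-cache
            (cons tick (make-hash-table :test #'equal))))
    (let* ((table (cdr my/query-cache))
           (hit (gethash query table 'miss)))
      (if (eq hit 'miss)
          (puthash query (funcall run-fn query) table)
        hit))))
```

Comparing the modification tick is cheap, so a cache hit costs little
more than a hash-table lookup.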

I don't know how widely it's used, but the repo is getting some regular
traffic, and I'm using it as the backend for my org-sidebar package.
I'd be happy if it could be made more generally useful, or if it could
be helpful to Org itself in some way.  Contributions are welcome.




Re: [O] How to make agenda generation faster

2018-10-14 Thread Marcin Borkowski


On 2018-10-11, at 21:59, Samuel Wales wrote:

> i too visit all files when emacs starts.
>
> are we saying that the speed depends on the number of headlines total
> or the number of headlines in a single file among the agenda files?

Probably the former...?

>
> On 10/11/18, Marcin Borkowski  wrote:
>>
>> On 2018-10-11, at 08:48, Michael Welle  wrote:
>>
>>> Hello,
>>>
>>> Marcin Borkowski  writes:
>>>
>>>> On 2018-10-08, at 09:20, Michael Welle wrote:
>>> [...]
>>>>> Well, on my laptop the initial agenda run takes about 7s or so (150
>>>>> agenda files) using the current day/week agenda ("a"). All subsequent
>>>>> (after loading the files) agenda runs are fast (split second I would
>>>>> say). I had some performance issues in the past caused by SCM. Emacs
>>>>> tried to check if every file is checked out in the latest version. That
>>>>> slowed down the process a lot (starting 150 mercurial processes in
>>>>> sequential order, checking results, etc.). The initial run doesn't
>>>>> bother me much. I bound the initial agenda run to an idle timer at
>>>>> Emacs
>>>>> start.
>>>>
>>>> Interesting.  I did not notice such differences between the first and
>>>> subsequent runs.
>>> I thought that behaviour is natural, scanning dirs for files and opening
>>> them is a costly operation. But a week ago I changed from rotating rust
>>> to solid state disks and that behaviour did not change much. I expected
>>> a speed up, but mee.
>>
>> Ah, I have /visiting/ all my agenda files (but not generating the agenda
>> itself) in my init.el.
>>
>> That explains a lot.
>>
>> Best,
>>
>> --
>> Marcin Borkowski
>> http://mbork.pl
>>
>>


-- 
Marcin Borkowski
http://mbork.pl



Re: [O] How to make agenda generation faster

2018-10-14 Thread Marcin Borkowski


On 2018-10-11, at 08:40, Michael Welle wrote:

> Hello,
>
> Marcin Borkowski  writes:
>
>> On 2018-10-09, at 13:47, Julius Dittmar  wrote:
>>
>>> Hi Marcin,
>>>
>>> I can't advise as to profiling to find out what really bogs down agenda
>>> building.
>>>
>>> I found that log messages do bog it down.
>>>
>>> I have a lot of recurring tasks, which accumulate log entries for every
>>> closing (which in fact means rescheduling to the next day). Every two to
>>> three months I prune my org files of those log entries. This
>>> significantly speeds up agenda building.
>>
>> By experiments, I found that the main bottleneck was a file with lots (=
>> a few thousand) headlines.
> ah, interesting. My org files usually aren't that deeply structured, so
> I don't get hit by that. Hm, I guess regexps are used to find headlines?

Mine were very flat - I had *lots* of captured links to websites.

Best,

-- 
Marcin Borkowski
http://mbork.pl



Re: [O] How to make agenda generation faster

2018-10-11 Thread Samuel Wales
i too visit all files when emacs starts.

are we saying that the speed depends on the number of headlines total
or the number of headlines in a single file among the agenda files?

On 10/11/18, Marcin Borkowski  wrote:
>
> On 2018-10-11, at 08:48, Michael Welle  wrote:
>
>> Hello,
>>
>> Marcin Borkowski  writes:
>>
>>> On 2018-10-08, at 09:20, Michael Welle  wrote:
>> [...]
>>>> Well, on my laptop the initial agenda run takes about 7s or so (150
>>>> agenda files) using the current day/week agenda ("a"). All subsequent
>>>> (after loading the files) agenda runs are fast (split second I would
>>>> say). I had some performance issues in the past caused by SCM. Emacs
>>>> tried to check if every file is checked out in the latest version. That
>>>> slowed down the process a lot (starting 150 mercurial processes in
>>>> sequential order, checking results, etc.). The initial run doesn't
>>>> bother me much. I bound the initial agenda run to an idle timer at
>>>> Emacs
>>>> start.
>>>
>>> Interesting.  I did not notice such differences between the first and
>>> subsequent runs.
>> I thought that behaviour is natural, scanning dirs for files and opening
>> them is a costly operation. But a week ago I changed from rotating rust
>> to solid state disks and that behaviour did not change much. I expected
>> a speed up, but mee.
>
> Ah, I have /visiting/ all my agenda files (but not generating the agenda
> itself) in my init.el.
>
> That explains a lot.
>
> Best,
>
> --
> Marcin Borkowski
> http://mbork.pl
>
>


-- 
The Kafka Pandemic: 

The disease DOES progress. MANY people have died from it. And ANYBODY
can get it at any time.

"You’ve really gotta quit this and get moving, because this is murder
by neglect." ---
.



Re: [O] How to make agenda generation faster

2018-10-11 Thread Marcin Borkowski


On 2018-10-11, at 08:48, Michael Welle wrote:

> Hello,
>
> Marcin Borkowski  writes:
>
>> On 2018-10-08, at 09:20, Michael Welle  wrote:
> [...]
>>> Well, on my laptop the initial agenda run takes about 7s or so (150
>>> agenda files) using the current day/week agenda ("a"). All subsequent
>>> (after loading the files) agenda runs are fast (split second I would
>>> say). I had some performance issues in the past caused by SCM. Emacs
>>> tried to check if every file is checked out in the latest version. That
>>> slowed down the process a lot (starting 150 mercurial processes in
>>> sequential order, checking results, etc.). The initial run doesn't
>>> bother me much. I bound the initial agenda run to an idle timer at Emacs
>>> start.
>>
>> Interesting.  I did not notice such differences between the first and
>> subsequent runs.
> I thought that behaviour is natural, scanning dirs for files and opening
> them is a costly operation. But a week ago I changed from rotating rust
> to solid state disks and that behaviour did not change much. I expected
> a speed up, but mee.

Ah, I have /visiting/ all my agenda files (but not generating the agenda
itself) in my init.el.
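
Visiting the agenda files ahead of time, without building the agenda,
might look like this sketch. `my/visit-agenda-files' is a hypothetical
name and not the actual init.el code; `org-agenda-files' and
`find-file-noselect' are the standard Org and Emacs functions.

```elisp
(require 'org)

(defun my/visit-agenda-files ()
  "Visit every agenda file in the background and return the buffers.
No agenda view is generated; the files are only loaded so that a
later agenda run does not have to open them."
  (mapcar #'find-file-noselect (org-agenda-files)))

;; To keep startup responsive, the call could be deferred until Emacs
;; has been idle for a few seconds:
;; (run-with-idle-timer 5 nil #'my/visit-agenda-files)
```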

That explains a lot.

Best,

--
Marcin Borkowski
http://mbork.pl



Re: [O] How to make agenda generation faster

2018-10-10 Thread Michael Welle
Hello,

Marcin Borkowski writes:

> On 2018-10-08, at 09:20, Michael Welle  wrote:
[...]
>> Well, on my laptop the initial agenda run takes about 7s or so (150
>> agenda files) using the current day/week agenda ("a"). All subsequent
>> (after loading the files) agenda runs are fast (split second I would
>> say). I had some performance issues in the past caused by SCM. Emacs
>> tried to check if every file is checked out in the latest version. That
>> slowed down the process a lot (starting 150 mercurial processes in
>> sequential order, checking results, etc.). The initial run doesn't
>> bother me much. I bound the initial agenda run to an idle timer at Emacs
>> start. 
>
> Interesting.  I did not notice such differences between the first and
> subsequent runs.
I thought that behaviour is natural, scanning dirs for files and opening
them is a costly operation. But a week ago I changed from rotating rust
to solid state disks and that behaviour did not change much. I expected
a speed up, but mee. 

Regards
hmw



Re: [O] How to make agenda generation faster

2018-10-10 Thread Michael Welle
Hello,

Marcin Borkowski writes:

> On 2018-10-09, at 13:47, Julius Dittmar  wrote:
>
>> Hi Marcin,
>>
>> I can't advise as to profiling to find out what really bogs down agenda
>> building.
>>
>> I found that log messages do bog it down.
>>
>> I have a lot of recurring tasks, which accumulate log entries for every
>> closing (which in fact means rescheduling to the next day). Every two to
>> three months I prune my org files of those log entries. This
>> significantly speeds up agenda building.
>
> By experiments, I found that the main bottleneck was a file with lots (=
> a few thousand) headlines.
ah, interesting. My org files usually aren't that deeply structured, so
I don't get hit by that. Hm, I guess regexps are used to find headlines?

Regards
hmw



Re: [O] How to make agenda generation faster

2018-10-10 Thread Samuel Wales
for cleaning logbook entries, i'd enjoy having an agenda view that
shows every entry that has state changes [above a minimum number of
them to keep it small], with the size of the logbook drawer in the
prefix or so next to the category, sorted by that size.

there would be a corresponding agenda batch command that would
archive, delete, or archive all except most recent for the marked
entries.

is it the number of headlines in a file or the total number in agenda files?

i think it's great to have org-ql.  lispy query is great.  although
mostly i just use text search, it would be more memorizable syntax for
tags type search [and custom sorts?].  is this a suitable start for
agenda-ng?  will it be cleaner and faster?

another speedup possibility might be to allow redoing the agenda with
a new sorting strategy without having to redo the scanning of agenda
files.

i agree not scanning unchanged buffers could really speed up the
agenda in principle. [it'd be great if emacs could parallelize across
smp cores in addition.  :]]


On 10/10/18, Marcin Borkowski wrote:
>
> On 2018-10-08, at 09:20, Michael Welle  wrote:
>
>> Hello,
>>
>> Marcin Borkowski  writes:
>>
>>> Hi Orgers,
>>>
>>> my agenda takes almost 10 seconds to show up.  Are there any ideas for
>>> profiling that?
>>>
>>> I suspect that archiving a lot of old entries I don't use anymore might
>>> help, but is there any way to e.g. display some stats on which
>>> file/headline took how much time?
>> since no one answered yet, there are some similar threads. IIRC the way
>> to go is to use elp for profiling.
>>
>> Well, on my laptop the initial agenda run takes about 7s or so (150
>> agenda files) using the current day/week agenda ("a"). All subsequent
>> (after loading the files) agenda runs are fast (split second I would
>> say). I had some performance issues in the past caused by SCM. Emacs
>> tried to check if every file is checked out in the latest version. That
>> slowed down the process a lot (starting 150 mercurial processes in
>> sequential order, checking results, etc.). The initial run doesn't
>> bother me much. I bound the initial agenda run to an idle timer at Emacs
>> start.
>
> Interesting.  I did not notice such differences between the first and
> subsequent runs.
>
> Anyway, thanks for your input (to all people who replied, actually).
>
> --
> Marcin Borkowski
> http://mbork.pl
>
>


-- 
The Kafka Pandemic: 

The disease DOES progress. MANY people have died from it. And ANYBODY
can get it at any time.

"You’ve really gotta quit this and get moving, because this is murder
by neglect." ---
.



Re: [O] How to make agenda generation faster

2018-10-10 Thread Marcin Borkowski


On 2018-10-08, at 09:20, Michael Welle wrote:

> Hello,
>
> Marcin Borkowski  writes:
>
>> Hi Orgers,
>>
>> my agenda takes almost 10 seconds to show up.  Are there any ideas for
>> profiling that?
>>
>> I suspect that archiving a lot of old entries I don't use anymore might
>> help, but is there any way to e.g. display some stats on which
>> file/headline took how much time?
> since no one answered yet, there are some similar threads. IIRC the way
> to go is to use elp for profiling.
>
> Well, on my laptop the initial agenda run takes about 7s or so (150
> agenda files) using the current day/week agenda ("a"). All subsequent
> (after loading the files) agenda runs are fast (split second I would
> say). I had some performance issues in the past caused by SCM. Emacs
> tried to check if every file is checked out in the latest version. That
> slowed down the process a lot (starting 150 mercurial processes in
> sequential order, checking results, etc.). The initial run doesn't
> bother me much. I bound the initial agenda run to an idle timer at Emacs
> start. 

Interesting.  I did not notice such differences between the first and
subsequent runs.

Anyway, thanks for your input (to all people who replied, actually).

-- 
Marcin Borkowski
http://mbork.pl



Re: [O] How to make agenda generation faster

2018-10-10 Thread Marcin Borkowski


On 2018-10-09, at 13:47, Julius Dittmar  wrote:

> Hi Marcin,
>
> I can't advise as to profiling to find out what really bogs down agenda
> building.
>
> I found that log messages do bog it down.
>
> I have a lot of recurring tasks, which accumulate log entries for every
> closing (which in fact means rescheduling to the next day). Every two to
> three months I prune my org files of those log entries. This
> significantly speeds up agenda building.

By experiment, I found that the main bottleneck was a file with lots (=
a few thousand) of headlines.

Best,

-- 
Marcin Borkowski
http://mbork.pl



Re: [O] How to make agenda generation faster

2018-10-10 Thread Marcin Borkowski


On 2018-10-09, at 18:11, Nicolas Goaziou  wrote:

> Hello,
>
> Adam Porter  writes:
>
>> My feedback is: there be dragons.  ;)  The Agenda code is very
>> complicated and hard to follow, and it's hard to optimize something that
>> is hard to understand.
>
> And hard to maintain. We should really do something about it.
>
>> In the long run, to get significant speed improvements, I think it may
>> be necessary to reimplement the Agenda.
>
> Agreed.

+1

> [...]
>
> I didn't look closely at org-ql, but I had the idea of splitting the
> Agenda in two distinct parts. One would be responsible for collecting,
> possibly asynchronously, and caching data from Org documents. The other
> one would provide a DSL to query and display the results extracted from
> the output of the first part. The second part could even be made generic
> enough to be extracted from Org and become some part of Emacs.
> Displaying filtered data, maybe in a timeline, could be useful for other
> packages. Unfortunately, I don't have time to work on this. Ah well.
>
> So again, wouldn't it be nice to think about Org Agenda-ng?

That is a great idea!  In general, I find Org-mode to be lacking APIs.
I'd love to build some applications on top of it, but getting some
information is very difficult.  (For instance, I'd like to get info
about clocks for all headlines in the agenda.  It seems I have to
implement parsing clocks myself, at least partially.)

Best,

-- 
Marcin Borkowski
http://mbork.pl



Re: [O] How to make agenda generation faster

2018-10-10 Thread Marcin Borkowski


On 2018-10-09, at 08:37, Adam Porter  wrote:

> Hi Marcin,
>
> [...]
>
> If you haven't seen them already, you may find my org-ql and
> org-ql-agenda code useful.  org-ql-agenda presents an Agenda-like
> buffer.  N.B. It does *not* implement most of the Agenda features, but
> it does emulate an Org Agenda buffer by setting the appropriate text
> properties on entries and formatting them in a similar way.
>
> It's built on org-ql, which provides per-buffer query caching, which
> means that generating an org-ql-agenda view for Org buffers that haven't
> changed since the last view was generated is very fast.  It's also
> written in a more functional way, which I think is easier to follow and
> modify.  Performance of uncached queries/buffers depends on the
> query--some are relatively fast, while others are slower than the "real"
> Org Agenda.  I think there is significant potential for optimizations,
> and I'm hoping to implement some in the future.  Your feedback would be
> appreciated!
>
> https://github.com/alphapapa/org-ql

Thanks, I'll check those out!

Best,

-- 
Marcin Borkowski
http://mbork.pl



Re: [O] How to make agenda generation faster

2018-10-09 Thread Nicolas Goaziou
Hello,

Adam Porter  writes:

> My feedback is: there be dragons.  ;)  The Agenda code is very
> complicated and hard to follow, and it's hard to optimize something that
> is hard to understand.

And hard to maintain. We should really do something about it.

> In the long run, to get significant speed improvements, I think it may
> be necessary to reimplement the Agenda.

Agreed.

> However, due to the nature of it (i.e. regexp searches through buffers
> to find entries), I don't know how much faster it can be made. I don't
> mean that I doubt it can be--I mean that, truly, I don't know, because
> it's hard to understand the flow of the code.
>
> I think that it is already fairly well optimized, given its limitations.
> However, an example of a potential improvement would be to refactor it
> to work with lexical-binding enabled (which didn't exist when it was
> first created); I can't say how much of an improvement it would make,
> but my understanding is that code that runs with lexical-binding enabled
> is generally faster.

Not really. But it's certainly easier to understand since it removes one
class of problems.

> But doing that would be a non-trivial project, I
> think, requiring the fixing of many inevitable regressions in the
> process.
>
> If you haven't seen them already, you may find my org-ql and
> org-ql-agenda code useful.  org-ql-agenda presents an Agenda-like
> buffer.  N.B. It does *not* implement most of the Agenda features, but
> it does emulate an Org Agenda buffer by setting the appropriate text
> properties on entries and formatting them in a similar way.

Instead of re-inventing the wheel, or putting efforts into a wheel-like
invention, wouldn't it make sense to actually work on Org Agenda itself?

I didn't look closely at org-ql, but I had the idea of splitting the
Agenda in two distinct parts. One would be responsible for collecting,
possibly asynchronously, and caching data from Org documents. The other
one would provide a DSL to query and display the results extracted from
the output of the first part. The second part could even be made generic
enough to be extracted from Org and become some part of Emacs.
Displaying filtered data, maybe in a timeline, could be useful for other
packages. Unfortunately, I don't have time to work on this. Ah well.

So again, wouldn't it be nice to think about Org Agenda-ng?

Regards,

-- 
Nicolas Goaziou
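
A minimal sketch of the two-part split proposed above, written in Python
purely as a language-neutral illustration (it is not Emacs Lisp and not
code from Org or org-ql). Part one scans a document once and returns
plain records; part two is a tiny query layer built from composable
predicates. Every name here (Entry, collect, todo, scheduled_on, all_of,
select) is hypothetical, and the parsing is deliberately simplified: it
assumes only the TODO and DONE keywords and a SCHEDULED line below the
heading.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Entry:
    """One collected heading; a plain record with no link to the buffer."""
    file: str
    level: int
    keyword: Optional[str]
    title: str
    scheduled: Optional[str] = None  # ISO date, e.g. "2018-10-10"


# Simplified patterns (assumption: only TODO/DONE keywords are in use).
HEADING = re.compile(r"^(\*+) (?:(TODO|DONE) )?(.*)$")
SCHEDULED = re.compile(r"SCHEDULED: <(\d{4}-\d{2}-\d{2})")


def collect(file: str, text: str) -> List[Entry]:
    """Part one: a single pass over the text that extracts records."""
    entries: List[Entry] = []
    for line in text.splitlines():
        heading = HEADING.match(line)
        if heading:
            entries.append(
                Entry(file, len(heading.group(1)), heading.group(2), heading.group(3))
            )
        elif entries:
            # A SCHEDULED line belongs to the most recent heading.
            sched = SCHEDULED.search(line)
            if sched:
                entries[-1].scheduled = sched.group(1)
    return entries


# Part two: queries are predicates over records, combined freely.
Predicate = Callable[[Entry], bool]


def todo(keyword: str) -> Predicate:
    return lambda e: e.keyword == keyword


def scheduled_on(day: str) -> Predicate:
    return lambda e: e.scheduled == day


def all_of(*preds: Predicate) -> Predicate:
    return lambda e: all(p(e) for p in preds)


def select(entries: List[Entry], pred: Predicate) -> List[Entry]:
    """Run a query against already-collected data; no re-parsing needed."""
    return [e for e in entries if pred(e)]
```

Because select only sees records, the collecting half could in principle
run ahead of time or be cached, and the query half does not need to know
anything about Org syntax.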



Re: [O] How to make agenda generation faster

2018-10-09 Thread Julius Dittmar
Hi Marcin,

I can't advise as to profiling to find out what really bogs down agenda
building.

I found that log messages do bog it down.

I have a lot of recurring tasks, which accumulate log entries for every
closing (which in fact means rescheduling to the next day). Every two to
three months I prune my org files of those log entries. This
significantly speeds up agenda building.

HTH,
Julius
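
To see where such log entries pile up before pruning, one could count
them per heading. The following is a rough sketch in Python (not Emacs
Lisp, and not part of Org); it assumes the log lines use Org's default
state-change format, i.e. lines beginning with `- State "DONE" from
"TODO" [timestamp]`, and the function name log_lines_per_heading is
made up for illustration.

```python
import re
from collections import Counter

# A heading is one or more stars, a space, then the rest of the line.
HEADING = re.compile(r"^\*+ (.*)$")
# Assumed default format of a state-change log line under a heading.
STATE_LOG = re.compile(r'^\s*- State "[^"]*"\s+from\s')


def log_lines_per_heading(text: str) -> Counter:
    """Count state-change log lines under each heading in an Org file's text."""
    counts: Counter = Counter()
    current = None
    for line in text.splitlines():
        heading = HEADING.match(line)
        if heading:
            current = heading.group(1)
        elif current is not None and STATE_LOG.match(line):
            counts[current] += 1
    return counts
```

Calling log_lines_per_heading(text).most_common(10) on the contents of
an agenda file would list the ten headings carrying the most log lines,
which are the natural candidates for pruning.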




Re: [O] How to make agenda generation faster

2018-10-08 Thread Adam Porter
Hi Marcin,

My feedback is: there be dragons.  ;)  The Agenda code is very
complicated and hard to follow, and it's hard to optimize something that
is hard to understand.

In the long run, to get significant speed improvements, I think it may
be necessary to reimplement the Agenda.  However, due to the nature of
it (i.e. regexp searches through buffers to find entries), I don't know
how much faster it can be made.  I don't mean that I doubt it can be--I
mean that, truly, I don't know, because it's hard to understand the flow
of the code.

I think that it is already fairly well optimized, given its limitations.
However, an example of a potential improvement would be to refactor it
to work with lexical-binding enabled (which didn't exist when it was
first created); I can't say how much of an improvement it would make,
but my understanding is that code that runs with lexical-binding enabled
is generally faster.  But doing that would be a non-trivial project, I
think, requiring the fixing of many inevitable regressions in the
process.

If you haven't seen them already, you may find my org-ql and
org-ql-agenda code useful.  org-ql-agenda presents an Agenda-like
buffer.  N.B. It does *not* implement most of the Agenda features, but
it does emulate an Org Agenda buffer by setting the appropriate text
properties on entries and formatting them in a similar way.

It's built on org-ql, which provides per-buffer query caching, which
means that generating an org-ql-agenda view for Org buffers that haven't
changed since the last view was generated is very fast.  It's also
written in a more functional way, which I think is easier to follow and
modify.  Performance of uncached queries/buffers depends on the
query--some are relatively fast, while others are slower than the "real"
Org Agenda.  I think there is significant potential for optimizations,
and I'm hoping to implement some in the future.  Your feedback would be
appreciated!

https://github.com/alphapapa/org-ql
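
The general idea of per-buffer query caching described above can be
sketched as follows. This is a language-neutral illustration in Python,
not org-ql's actual implementation: the class and function names are
hypothetical, and using a content hash to detect changes is an
assumption made for the sketch.

```python
import hashlib
import re
from typing import Callable, Dict, List, Tuple

Query = Callable[[str], List[str]]


class BufferQueryCache:
    """Cache query results per buffer; drop them when the text changes."""

    def __init__(self) -> None:
        # buffer name -> (content digest, {query name -> results})
        self._store: Dict[str, Tuple[str, Dict[str, List[str]]]] = {}
        self.misses = 0  # how many times a query really had to run

    def run(self, buffer_name: str, text: str, query_name: str, query: Query) -> List[str]:
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        cached_digest, results = self._store.get(buffer_name, ("", {}))
        if cached_digest != digest:
            # The buffer changed: discard everything cached for it.
            results = {}
            self._store[buffer_name] = (digest, results)
        if query_name not in results:
            self.misses += 1
            results[query_name] = query(text)
        return results[query_name]


def todo_headings(text: str) -> List[str]:
    """Example query: titles of TODO headings, found by regexp search."""
    return re.findall(r"^\*+ TODO (.*)$", text, flags=re.MULTILINE)
```

Running the same query twice against unchanged text hits the cache the
second time; any edit to the text changes the digest and forces the
query to run again, which is the slow, uncached case mentioned above.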




Re: [O] How to make agenda generation faster

2018-10-08 Thread Michael Welle
Hello,

Marcin Borkowski  writes:

> Hi Orgers,
>
> my agenda takes almost 10 seconds to show up.  Are there any ideas for
> profiling that?
>
> I suspect that archiving a lot of old entries I don't use anymore might
> help, but is there any way to e.g. display some stats on which
> file/headline took how much time?

Since no one answered yet, there are some similar threads. IIRC the way
to go is to use elp for profiling.

Well, on my laptop the initial agenda run takes about 7s or so (150
agenda files) using the current day/week agenda ("a"). All subsequent
(after loading the files) agenda runs are fast (split second I would
say). I had some performance issues in the past caused by SCM. Emacs
tried to check if every file is checked out in the latest version. That
slowed down the process a lot (starting 150 mercurial processes in
sequential order, checking results, etc.). The initial run doesn't
bother me much. I bound the initial agenda run to an idle timer at Emacs
start. 

Regards
hmw



[O] How to make agenda generation faster

2018-10-06 Thread Marcin Borkowski
Hi Orgers,

my agenda takes almost 10 seconds to show up.  Are there any ideas for
profiling that?

I suspect that archiving a lot of old entries I don't use anymore might
help, but is there any way to e.g. display some stats on which
file/headline took how much time?

TIA,

-- 
Marcin Borkowski
http://mbork.pl