[Orgmode] org-scan-tags
In org-scan-tags, if todo-only is t, would it be possible to speed things up by changing the regexp to go to just the lines with a TODO keyword? I.e. in

    (let* ((re (concat "^" outline-regexp " *\\(\\<\\("
                       (mapconcat 'regexp-quote org-todo-keywords-1 "\\|")
                       (org-re
                        "\\>\\)\\)? *\\(.*?\\)\\(:[[:alnum:]_@:]+:\\)?[ \t]*$")))

remove the first "?" if todo-only is t. Also, regexp-opt might make a more efficient regexp than mapconcat with regexp-quote.

Reason for request: I'm writing an extension of org for setting & checking goals, and want to quickly find entries with headlines of the form GOAL, of which there may be relatively few in a large file. Stepping through all entries and then checking each one for the GOAL keyword is very inefficient. It would be much faster if the regexp included GOAL as a keyword.

It would be good if the parameter todo-only could be a list of strings, so that org-scan-tags would return only the headlines whose todo keyword is in this list. It could use regexp-opt to make an efficient regexp for this.

There also seem to be other opportunities for speeding up org-scan-tags in this way: e.g. if the match string includes +mytag, the regexp for the headline could include this as well. Similarly for properties. Maybe org-make-tags-matcher could return a list of tags and properties that must appear in any matching entry.

It would also help if the tags matcher expression could refer to text properties stored on the headline, perhaps with conditions such as :myprop=X (i.e. same as for org properties, but the property name must be a keyword). It already does this for the 'org-category text property. Then one can e.g. mark entries representing unmet goals with text properties, and then use a regular org-tags-view to browse them in a sparse tree or an agenda.

Thanks,
ilya

___
Emacs-orgmode mailing list
Please use `Reply All' to send replies to the list.
Emacs-orgmode@gnu.org
http://lists.gnu.org/mailman/listinfo/emacs-orgmode
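As a rough illustration of the request above, the two regexp changes (dropping the "?" when a keyword is required, and using regexp-opt) might look like the sketch below. The names my/headline-regexp and my/headline-keyword are hypothetical helpers, not Org functions, and the sketch assumes headlines start with stars followed by a space instead of consulting outline-regexp. It only builds and tests the regexp; it does not deal with tag inheritance.

```elisp
;; Sketch only: `my/headline-regexp' and `my/headline-keyword' are
;; hypothetical helpers, not part of Org.
;; Assumption: headlines start with one or more stars and a space.
(defun my/headline-regexp (keywords require-keyword)
  "Return a regexp matching an Org-style headline.
KEYWORDS is a list of TODO keyword strings.  With REQUIRE-KEYWORD
non-nil, only headlines carrying one of KEYWORDS match.
Group 1 is the keyword, group 2 the title, group 3 the tag string."
  (concat "^\\*+ +"
          ;; `regexp-opt' with `words' wraps the alternatives in word
          ;; boundaries and one numbered group.
          "\\(?:" (regexp-opt keywords 'words) " *\\)"
          ;; The keyword part is optional unless a keyword is required.
          (if require-keyword "" "?")
          "\\(.*?\\)"
          "\\(?:[ \t]+\\(:[[:alnum:]_@:]+:\\)\\)?[ \t]*$"))

(defun my/headline-keyword (regexp headline)
  "Return the keyword REGEXP finds in the string HEADLINE, or nil."
  (let ((case-fold-search nil))         ; keywords are case-sensitive
    (and (string-match regexp headline)
         (match-string 1 headline))))
```

With (my/headline-regexp '("GOAL") t), a search would stop only at GOAL headlines, which is the behaviour asked for when such entries are rare in a large file.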
Re: [Orgmode] org-scan-tags
Hi Ilya,

Ilya Shlyakhter writes:

> In org-scan-tags, if todo-only is t, would it be possible to speed
> things up by changing the regexp to go to just the lines with a TODO
> keyword?
> I.e. in
>
> (let* ((re (concat "^" outline-regexp " *\\(\\<\\("
>                    (mapconcat 'regexp-quote org-todo-keywords-1 "\\|")
>                    (org-re
>                     "\\>\\)\\)? *\\(.*?\\)\\(:[[:alnum:]_@:]+:\\)?[ \t]*$")))
>
> remove the first "?" if todo-only is t. Also, regexp-opt might make
> a more efficient regexp than mapconcat with regexp-quote.

I've optimized org-scan-tags a bit following your ideas (gaining ~12% according to elp) -- thanks for these directions.

> It would be good if the parameter todo-only could be a list of
> strings, and org-scan-tags would return only the headlines where the
> todo keyword is from this list.

This would be confusing. In particular, org-tags-view calls org-scan-tags with both the todo-only argument and a matcher, so if you make the todo-only argument aware of TODO keywords, there might be some interference between todo-only and the matcher.

I'd rather not go that route.

Thanks,

--
 Bastien
Re: [Orgmode] org-scan-tags
On Sep 15, 2010, at 5:19 AM, Ilya Shlyakhter wrote:

> In org-scan-tags, if todo-only is t, would it be possible to speed
> things up by changing the regexp to go to just the lines with a TODO
> keyword?

I believe this may cause a problem. The scanner needs to see at least every parent node to be able to collect all inherited tags. So I think that a tree like

  * heading
  ** one :tag1:
  *** TODO two :tag2:

would incorrectly miss out on :tag1:

- Carsten
Re: [Orgmode] org-scan-tags
On Feb 3, 2011, at 6:32 AM, Carsten Dominik wrote:

> On Sep 15, 2010, at 5:19 AM, Ilya Shlyakhter wrote:
>
>> In org-scan-tags, if todo-only is t, would it be possible to speed
>> things up by changing the regexp to go to just the lines with a TODO
>> keyword?
>
> I believe this may cause a problem. The scanner needs to see at
> least every parent node to be able to collect all inherited tags.
> So I think that a tree like
>
>   * heading
>   ** one :tag1:
>   *** TODO two :tag2:
>
> would incorrectly miss out on :tag1:

OK, here is an example where it really does fail:

  * heading
  ** one :tag1:
  *** two
  *** two :tag2:
  *** TODO two :tag2:
  *** two :tag2:

Fold up the tree, then do

  C-c / m +tag1/! RET

This should find the "TODO two", but it does not, because the new regexp moves right past the "one" line and so tag1 is overlooked.

- Carsten
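The failure in this example can be reproduced outside Org with a small model of a scanner that caches tags per outline level. This is only a sketch under assumptions: my/scan is a hypothetical function, TODO is the only keyword, and the per-level cache merely imitates the idea of collecting inherited tags while walking the headlines; it is not the actual org-scan-tags code.

```elisp
(require 'cl-lib)

;; Sketch only: `my/scan' is a hypothetical model, not `org-scan-tags'.
(defun my/scan (text tag skip-non-todo)
  "Return titles of TODO headlines in TEXT carrying TAG, local or inherited.
With SKIP-NON-TODO non-nil, visit only headlines that have a TODO
keyword, imitating the attempted optimization."
  (let ((re (concat "^\\(\\*+\\) +\\(TODO +\\)"
                    (if skip-non-todo "" "?")
                    "\\(.*?\\)\\(?:[ \t]+:\\([[:alnum:]_@:]+\\):\\)?[ \t]*$"))
        tags-by-level                   ; alist of (LEVEL . TAGS) for ancestors
        hits)
    (with-temp-buffer
      (insert text)
      (goto-char (point-min))
      (let ((case-fold-search nil))     ; bound inside the temp buffer
        (while (re-search-forward re nil t)
          (let* ((level (length (match-string 1)))
                 (todo (match-string 2))
                 (title (match-string 3))
                 (local (and (match-string 4)
                             (split-string (match-string 4) ":" t))))
            ;; Forget siblings and deeper levels, then record this headline.
            (setq tags-by-level
                  (cl-remove-if (lambda (cell) (>= (car cell) level))
                                tags-by-level))
            (push (cons level local) tags-by-level)
            (when (and todo
                       (member tag (apply #'append
                                          (mapcar #'cdr tags-by-level))))
              (push title hits))))))
    (nreverse hits)))
```

Called on the tree above with the tag "tag1", the full scan reports the TODO entry, while the keyword-only scan never visits the "one" line and so never records tag1.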
Re: [Orgmode] org-scan-tags
Carsten Dominik writes:

> OK, here is an example where it really does fail:
>
>   * heading
>   ** one :tag1:
>   *** two
>   *** two :tag2:
>   *** TODO two :tag2:
>   *** two :tag2:
>
> Fold up the tree, then do
>
>   C-c / m +tag1/! RET
>
> This should find the "TODO two", but it does not, because the
> new regexp moves right past the "one" line and so tag1 is
> overlooked.

Right, thanks for the detailed example. I reverted the commit; it should be fine again.

--
 Bastien
Re: [Orgmode] org-scan-tags
Thanks for the fast reaction, Bastien!

- Carsten

On Feb 3, 2011, at 5:37 PM, Bastien wrote:

> Right, thanks for the detailed example. I reverted the commit,
> it should be fine again.
Re: [Orgmode] org-scan-tags
Thanks for catching this, Carsten!

This could perhaps be fixed by doing a full lookup of the tags up the hierarchy, rather than relying on the cached tags. This is more expensive per entry, but if fewer entries actually have to be looked at (because the search only stops at TODO entries), it might be faster overall.

One general way to speed up searches would be to move as much work as possible into Emacs' built-in regexp matcher. Right now, a search expression is parsed into an elisp form that is evaluated at each entry and says whether the entry matches. Each clause of a search expression could instead be parsed into an elisp form _and_ a regexp, such that matching the regexp would be a necessary (but not sufficient) condition for the entry to match. E.g. if looking for entries with property PROP equal to 1, you could construct a regexp that would match only that.

Some things aren't expressible in regexp language, so they'd still have to be checked in lisp. And tag lookups could not use the cache. But if most of the filtering is done by Emacs' regexp matcher, with only a bit of lisp filtering on top of that, overall searches might be faster.

On Thu, Feb 3, 2011 at 11:37 AM, Bastien wrote:

> Right, thanks for the detailed example. I reverted the commit,
> it should be fine again.
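A minimal sketch of the two ideas in this message, under stated assumptions: a regexp acts as the necessary condition (the headline has a TODO keyword), and a lisp check on top of it does a full walk up the hierarchy instead of using cached tags. Both function names are hypothetical, TODO is assumed to be the only keyword, and nothing here claims to be faster than the existing scanner; that would need measuring.

```elisp
;; Sketch only: both functions are hypothetical, not part of Org.
(defconst my/headline-re
  "^\\(\\*+\\) +.*?\\(?:[ \t]+:\\([[:alnum:]_@:]+\\):\\)?[ \t]*$"
  "Matches a headline; group 1 is the stars, group 2 the tags without
the outer colons.")

(defun my/tags-with-ancestors ()
  "Return local plus inherited tags of the headline on the current line.
Walks backward to each ancestor instead of relying on a cache."
  (save-excursion
    (beginning-of-line)
    (let ((level nil) (continue t) tags)
      (while continue
        ;; Take the first line as is, then only strictly shallower headlines.
        (when (and (looking-at my/headline-re)
                   (or (null level) (< (length (match-string 1)) level)))
          (setq level (length (match-string 1)))
          (when (match-string 2)
            (setq tags (append (split-string (match-string 2) ":" t) tags))))
        ;; Stop at a top-level headline or at the start of the buffer.
        (setq continue (and (or (null level) (> level 1))
                            (zerop (forward-line -1)))))
      tags)))

(defun my/todo-headlines-with-tag (text tag)
  "Return titles of TODO headlines in TEXT whose tags include TAG.
Inherited tags count."
  (let (hits)
    (with-temp-buffer
      (insert text)
      (goto-char (point-min))
      (let ((case-fold-search nil))
        ;; Necessary condition, checked by the regexp engine: a TODO keyword.
        (while (re-search-forward
                "^\\*+ +TODO +\\(.*?\\)\\(?:[ \t]+:[[:alnum:]_@:]+:\\)?[ \t]*$"
                nil t)
          (let ((title (match-string 1)))
            ;; Remaining condition, checked in lisp with a full lookup.
            (when (member tag (my/tags-with-ancestors))
              (push title hits))))))
    (nreverse hits)))
```

On Carsten's example tree, searching for "tag1" this way does find the TODO entry, because the backward walk reaches the "one" line that the forward search skipped. The cost is one backward walk per hit, which is the trade-off described above.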