Re: [PATCH] ob-tangle.el: Speed up tangling
Hi Sébastien,

Sébastien Miquel writes:

> Indeed, I hadn't thought to run the tests, sorry. I've fixed my code
> and modified the `block-order` test in order for it to pass.

Applied with commit a2cb9b853, thank you very much for the updated
patch.

> The patch does modify the order of the tangled blocks. When several
> blocks with different languages are tangled to the same file, they
> used to be grouped according to language, and are now tangled in the
> order in which they appear. I assumed this was an oversight in the
> previous code, but since this test exists, maybe it was intended?
>
> Nicolas Goaziou wrote this test, perhaps he could comment on this.

Sure, feel free to adapt the code further if needed after discussing
this.

Thanks,

-- 
Bastien
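
The ordering change discussed above can be illustrated with a minimal sketch. It is not the code from the patch: the function name `my/group-blocks-by-file` and the `(FILE . LANG)` representation of a block are hypothetical, chosen only to show grouping by target file while keeping blocks in their order of appearance.

```elisp
;; Hypothetical sketch, not the ob-tangle.el implementation.
;; Each block is a cons cell (FILE . LANG), listed in buffer order.

(defun my/group-blocks-by-file (blocks)
  "Group BLOCKS by target file, keeping blocks in order of appearance.
Return an alist of (FILE . LANGS), with files in order of first use."
  (let (groups)
    (dolist (block blocks)
      (let ((entry (assoc (car block) groups)))
        (if entry
            ;; Prepend for now; each group is reversed once at the end.
            (push (cdr block) (cdr entry))
          (push (list (car block) (cdr block)) groups))))
    (mapcar (lambda (group)
              (cons (car group) (reverse (cdr group))))
            (nreverse groups))))

;; Two languages tangled to the same file stay interleaved:
(my/group-blocks-by-file
 '(("out.txt" . "sh") ("out.txt" . "python") ("out.txt" . "sh")))
```

With grouping by language instead, the two "sh" blocks would end up next to each other in the output file, which is the behavior the `block-order` test originally checked.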
Re: [PATCH] ob-tangle.el: Speed up tangling
Hi Bastien,

Bastien writes:

> The compiler is complaining with
>
> In toplevel form:
> ob-tangle.el:196:1: Warning: Variable ‘modes’ left uninitialized
>
> Also, it breaks these two tests for me:
>
> 2 unexpected results:
>    FAILED  ob-tangle/block-order
>    FAILED  ob-tangle/continued-code-blocks-w-noweb-ref

Indeed, I hadn't thought to run the tests, sorry. I've fixed my code
and modified the `block-order` test in order for it to pass.

The patch does modify the order of the tangled blocks. When several
blocks with different languages are tangled to the same file, they
used to be grouped according to language, and are now tangled in the
order in which they appear. I assumed this was an oversight in the
previous code, but since this test exists, maybe it was intended?

Nicolas Goaziou wrote this test, perhaps he could comment on this.

Regards,

-- 
Sébastien Miquel

From 2aa09e8d2f4e8703190e9035d711508c11b3a8eb Mon Sep 17 00:00:00 2001
From: Sébastien Miquel
Date: Sat, 1 May 2021 21:18:44 +0200
Subject: [PATCH] ob-tangle.el: Improve tangling

* lisp/ob-tangle.el (org-babel-tangle-collect-blocks): Group
collected blocks by tangled file name.
(org-babel-tangle): Avoid quadratic behavior in number of blocks and
set modes before writing to file.
* testing/lisp/test-ob-tangle.el (ob-tangle/block-order): Update test.
---
 lisp/ob-tangle.el              | 151 -
 testing/lisp/test-ob-tangle.el |   2 +-
 2 files changed, 74 insertions(+), 79 deletions(-)

diff --git a/lisp/ob-tangle.el b/lisp/ob-tangle.el
index 4c0c3132d..36144d6ae 100644
--- a/lisp/ob-tangle.el
+++ b/lisp/ob-tangle.el
@@ -225,67 +225,55 @@ matching a regular expression."
  (or (cdr (assq :tangle (nth 2 (org-babel-get-src-block-info 'light
  (user-error "Point is not in a source code block"
  path-collector)
- (mapc ;; map over all languages
- (lambda (by-lang)
- (let* ((lang (car by-lang))
- (specs (cdr by-lang))
- (ext (or (cdr (assoc lang org-babel-tangle-lang-exts)) lang))
- (lang-f (org-src-get-lang-mode lang))
- she-banged)
- (mapc
- (lambda (spec)
- (let ((get-spec (lambda (name) (cdr (assoc name (nth 4 spec))
- (let* ((tangle (funcall get-spec :tangle))
- (she-bang (let ((sheb (funcall get-spec :shebang)))
- (when (> (length sheb) 0) sheb)))
- (tangle-mode (funcall get-spec :tangle-mode))
- (base-name (cond
- ((string= "yes" tangle)
- (file-name-sans-extension
- (nth 1 spec)))
- ((string= "no" tangle) nil)
- ((> (length tangle) 0) tangle)))
- (file-name (when base-name
- ;; decide if we want to add ext to base-name
- (if (and ext (string= "yes" tangle))
- (concat base-name "." ext) base-name
- (when file-name
- ;; Possibly create the parent directories for file.
- (let ((m (funcall get-spec :mkdirp))
- (fnd (file-name-directory file-name)))
- (and m fnd (not (string= m "no"))
- (make-directory fnd 'parents)))
- ;; delete any old versions of file
- (and (file-exists-p file-name)
- (not (member file-name (mapcar #'car path-collector)))
- (delete-file file-name))
- ;; drop source-block to file
- (with-temp-buffer
- (when (fboundp lang-f) (ignore-errors (funcall lang-f)))
- (when (and she-bang (not (member file-name she-banged)))
+ (mapc ;; map over file-names
+ (lambda (by-fn)
+ (let ((file-name (car by-fn)))
+ (when file-name
+ (let ((lspecs (cdr by-fn))
+ (fnd (file-name-directory file-name))
+ modes make-dir she-banged lang)
+ ;; drop source-blocks to file
+ ;; We avoid append-to-file as it does not work with tramp.
+ (with-temp-buffer
+ (mapc
+ (lambda (lspec)
+ (let* ((block-lang (car lspec))
+ (spec (cdr lspec))
+ (get-spec (lambda (name) (cdr (assq name (nth 4 spec)
+ (she-bang (let ((sheb (funcall get-spec :shebang)))
+ (when (> (length sheb) 0) sheb)))
+ (tangle-mode (funcall get-spec :tangle-mode)))
+ (unless (string-equal block-lang lang)
+ (setq lang block-lang)
+ (let ((lang-f (org-src-get-lang-mode lang)))
+ (when (fboundp lang-f) (ignore-errors (funcall lang-f)
+ ;; if file contains she-bangs, then make it executable
+ (when she-bang
+ (unless tangle-mode (setq tangle-mode #o755)))
+ (when tangle-mode
+ (add-to-list 'modes tangle-mode))
+ ;; Possibly create the parent directories for file.
+ (let ((m (funcall get-spec :mkdirp)))
+ (and m fnd (not (string= m "no"))
+ (setq make-dir t)))
+ ;; Handle :padlines unless first line in file
+ (unless (or (string= "no" (funcall get-spec :padline))
+ (= (point) (point-min)))
+ (insert "\n"))
+ (when (and she-bang (not she-banged))
Re: [PATCH] ob-tangle.el: Speed up tangling
Hi Sébastien,

thanks for the patch! I applied it against master and tested it.

The compiler is complaining with

In toplevel form:
ob-tangle.el:196:1: Warning: Variable ‘modes’ left uninitialized

Also, it breaks these two tests for me:

2 unexpected results:
   FAILED  ob-tangle/block-order
   FAILED  ob-tangle/continued-code-blocks-w-noweb-ref

Let me know if you manage to pass all the tests (or fix them...) and
silence the warning.

Thanks!

-- 
Bastien
Re: [PATCH] ob-tangle.el: Speed up tangling
Sébastien Miquel writes:

> On second thought, I'm uneasy about my approach. If tangling fails,
> the user might miss the error message since it is quickly replaced by
> the tangling info. Ideally we should backup all the tangled files and
> restore them all if a single one fails to ensure we're back to a
> consistent state.
>
> I'm unsure what would be best practices here. In the case of remote
> tangled files, I don't know if temporary files should be remote or
> not, and what guarantees Emacs primitives such as ~rename-file~
> offer.

Just 2c from me on how I'd like this to work as a user, when tangling
fails:
+ Every file that could be tangled is tangled, or there's a variable
  which controls what to do on an error
+ Loud message at the end that lists all files which failed to tangle

-- 
Timothy
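
The behavior Timothy asks for (keep going past a failing file, then report every failure once at the end) could look roughly like the sketch below. This is an illustration only: `my/tangle-all` and its WRITE-FN argument are hypothetical names, not part of ob-tangle.el, and the sketch does not implement the variable that would control what to do on error.

```elisp
;; Hypothetical sketch: `my/tangle-all' and WRITE-FN are illustrative
;; names, not part of ob-tangle.el.

(defun my/tangle-all (files write-fn)
  "Call WRITE-FN on each of FILES, continuing past errors.
Return the list of (FILE . ERROR-MESSAGE) for files that failed."
  (let (failed)
    (dolist (file files)
      (condition-case err
          (funcall write-fn file)
        ;; Record the failure and carry on with the next file.
        (error (push (cons file (error-message-string err)) failed))))
    (setq failed (nreverse failed))
    (when failed
      ;; Report once, after the loop, rather than once per file.
      (message "Failed to tangle %d file(s): %s"
               (length failed)
               (mapconcat #'car failed ", ")))
    failed))
```

Returning the failures as a list, in addition to the message, lets a caller decide whether to signal an error afterwards.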
Re: [PATCH] ob-tangle.el: Speed up tangling
Hi Tom,

Thank you again for your comments.

Tom Gillespie writes:

> I think that the location of condition-case is ok, but I wonder what
> would happen if something were to fail before entering that? I think
> that only a subset of the files would be tangled, but they would all
> have their correct modes, so I think that that is ok.

On second thought, I'm uneasy about my approach. If tangling fails,
the user might miss the error message since it is quickly replaced by
the tangling info. Ideally we should backup all the tangled files and
restore them all if a single one fails to ensure we're back to a
consistent state.

I'm unsure what would be best practices here. In the case of remote
tangled files, I don't know if temporary files should be remote or
not, and what guarantees Emacs primitives such as ~rename-file~
offer.

Although a robust tangling system that deals with errors and
guarantees that the state ends up consistent would be nice to have,
I'll take the failure considerations off this patch to keep it
simple. It'll make a better starting point for future work at least.

As is currently the case, if tangling fails, an error will be thrown;
the user will certainly notice and should assume that everything is
broken until another tangling succeeds.

I've kept the modes improvements.

Regards,

-- 
Sébastien Miquel

From 6b123c956ac7abe0210cf7b1145ebe0a68f04713 Mon Sep 17 00:00:00 2001
From: Sébastien Miquel
Date: Sat, 17 Apr 2021 21:48:30 +0200
Subject: [PATCH] ob-tangle.el: Improve tangling

* lisp/ob-tangle.el (org-babel-tangle-collect-blocks): Group
collected blocks by tangled file name.
(org-babel-tangle): Avoid quadratic behavior in number of blocks and
set modes before writing to file.
---
 lisp/ob-tangle.el | 151 ++
 1 file changed, 73 insertions(+), 78 deletions(-)

diff --git a/lisp/ob-tangle.el b/lisp/ob-tangle.el
index 4c0c3132d..8ca6b66fe 100644
--- a/lisp/ob-tangle.el
+++ b/lisp/ob-tangle.el
@@ -225,67 +225,55 @@ matching a regular expression."
  (or (cdr (assq :tangle (nth 2 (org-babel-get-src-block-info 'light
  (user-error "Point is not in a source code block"
  path-collector)
- (mapc ;; map over all languages
- (lambda (by-lang)
- (let* ((lang (car by-lang))
- (specs (cdr by-lang))
- (ext (or (cdr (assoc lang org-babel-tangle-lang-exts)) lang))
- (lang-f (org-src-get-lang-mode lang))
- she-banged)
- (mapc
- (lambda (spec)
- (let ((get-spec (lambda (name) (cdr (assoc name (nth 4 spec))
- (let* ((tangle (funcall get-spec :tangle))
- (she-bang (let ((sheb (funcall get-spec :shebang)))
- (when (> (length sheb) 0) sheb)))
- (tangle-mode (funcall get-spec :tangle-mode))
- (base-name (cond
- ((string= "yes" tangle)
- (file-name-sans-extension
- (nth 1 spec)))
- ((string= "no" tangle) nil)
- ((> (length tangle) 0) tangle)))
- (file-name (when base-name
- ;; decide if we want to add ext to base-name
- (if (and ext (string= "yes" tangle))
- (concat base-name "." ext) base-name
- (when file-name
- ;; Possibly create the parent directories for file.
- (let ((m (funcall get-spec :mkdirp))
- (fnd (file-name-directory file-name)))
- (and m fnd (not (string= m "no"))
- (make-directory fnd 'parents)))
- ;; delete any old versions of file
- (and (file-exists-p file-name)
- (not (member file-name (mapcar #'car path-collector)))
- (delete-file file-name))
- ;; drop source-block to file
- (with-temp-buffer
- (when (fboundp lang-f) (ignore-errors (funcall lang-f)))
- (when (and she-bang (not (member file-name she-banged)))
+ (mapc ;; map over file-names
+ (lambda (by-fn)
+ (let ((file-name (car by-fn)))
+ (when file-name
+ (let ((lspecs (cdr by-fn))
+ (fnd (file-name-directory file-name))
+ modes make-dir she-banged lang)
+ ;; drop source-blocks to file
+ ;; We avoid append-to-file as it does not work with tramp.
+ (with-temp-buffer
+ (mapc
+ (lambda (lspec)
+ (let* ((block-lang (car lspec))
+ (spec (cdr lspec))
+ (get-spec (lambda (name) (cdr (assq name (nth 4 spec)
+ (she-bang (let ((sheb (funcall get-spec :shebang)))
+ (when (> (length sheb) 0) sheb)))
+ (tangle-mode (funcall get-spec :tangle-mode)))
+ (unless (string-equal block-lang lang)
+ (setq lang block-lang)
+ (let ((lang-f (org-src-get-lang-mode lang)))
+ (when (fboundp lang-f) (ignore-errors (funcall lang-f)
+ ;; if file contains she-bangs, then make it executable
+ (when she-bang
+ (unless tangle-mode (setq tangle-mode #o755)))
+ (when tangle-mode
+ (add-to-list modes tangle-mode))
+ ;; Possibly create the parent directories for file.
+
Re: [PATCH] ob-tangle.el: Speed up tangling
Hi Sébastien,

The temp -> rename approach is good, but you should probably use
make-temp-file to create the file to reduce the risk of
collisions/race conditions. For example as
(make-temp-file (concat file-name ".tangling")).

I think that the location of condition-case is ok, but I wonder what
would happen if something were to fail before entering that? I think
that only a subset of the files would be tangled, but they would all
have their correct modes, so I think that that is ok.

I also think that the message to the user should probably not be
changed right now. While it might be useful for debugging, if someone
is tangling to a large number of files then the filenames/paths are
going to flood messages, so I would leave it out of this patch, and
possibly submit it as another patch for a separate discussion.

Best!
Tom
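
A minimal sketch of the temp -> rename approach with make-temp-file, as suggested above. The function name `my/write-via-temp-file` is hypothetical and this is not the code from the patch. It assumes local files; as noted elsewhere in the thread, what ~rename-file~ guarantees for remote files is an open question.

```elisp
;; Hypothetical sketch: `my/write-via-temp-file' is an illustrative
;; name, not a function from ob-tangle.el.

(defun my/write-via-temp-file (file-name content)
  "Write CONTENT to FILE-NAME by way of a temporary file next to it.
An existing FILE-NAME is left untouched if the write fails."
  (let* ((target (expand-file-name file-name))
         ;; An absolute prefix makes the temp file a sibling of TARGET.
         (tmp (make-temp-file (concat target ".tangling")))
         (done nil))
    (unwind-protect
        (progn
          ;; A VISIT argument other than t, nil, or a string
          ;; suppresses the "Wrote file" message.
          (write-region content nil tmp nil 'silent)
          ;; A non-nil third argument allows replacing an existing file.
          (rename-file tmp target t)
          (setq done t))
      ;; Clean up the temporary file if anything above failed.
      (unless done
        (when (file-exists-p tmp)
          (delete-file tmp))))))
```

Note that the renamed file keeps whatever permissions make-temp-file gave the temporary file, so the tangle modes would still need to be applied, either to the temporary file before the rename or to the target afterwards.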
Re: [PATCH] ob-tangle.el: Speed up tangling
Hi Tom,

Thank you for the comments.

Tom Gillespie writes:

> All of the issues that I'm aware of are related to what
> happens if tangling fails part way through the process.

That's not something I had considered. I wrote a new version of the
patch (attached) which addresses the insecure behaviour and the
possibility of failure. Please tell me what you think.

I've also
+ silenced the ~write-region~ messages, since I'm now writing to
  temporary files.
+ added the list of tangled files to the message to the user at the
  end of the tangling process.
+ replaced the use of ~when-let~.

Regards,

-- 
Sébastien Miquel

From 82e4c1beade71194c90d377cdff7ef23532f4aa2 Mon Sep 17 00:00:00 2001
From: Sébastien Miquel
Date: Sat, 17 Apr 2021 21:48:30 +0200
Subject: [PATCH] ob-tangle.el: Improve tangling

* lisp/ob-tangle.el (org-babel-tangle-collect-blocks): Group
collected blocks by tangled file name.
(org-babel-tangle): Avoid quadratic behavior in number of blocks.
Preserve original file in case of failure. Display the list of
tangled files at the end.
---
 lisp/ob-tangle.el | 167 --
 1 file changed, 87 insertions(+), 80 deletions(-)

diff --git a/lisp/ob-tangle.el b/lisp/ob-tangle.el
index 4c0c3132d..efafef5b8 100644
--- a/lisp/ob-tangle.el
+++ b/lisp/ob-tangle.el
@@ -225,87 +225,83 @@ matching a regular expression."
  (or (cdr (assq :tangle (nth 2 (org-babel-get-src-block-info 'light
  (user-error "Point is not in a source code block"
  path-collector)
- (mapc ;; map over all languages
- (lambda (by-lang)
- (let* ((lang (car by-lang))
- (specs (cdr by-lang))
- (ext (or (cdr (assoc lang org-babel-tangle-lang-exts)) lang))
- (lang-f (org-src-get-lang-mode lang))
- she-banged)
- (mapc
- (lambda (spec)
- (let ((get-spec (lambda (name) (cdr (assoc name (nth 4 spec))
- (let* ((tangle (funcall get-spec :tangle))
- (she-bang (let ((sheb (funcall get-spec :shebang)))
- (when (> (length sheb) 0) sheb)))
- (tangle-mode (funcall get-spec :tangle-mode))
- (base-name (cond
- ((string= "yes" tangle)
- (file-name-sans-extension
- (nth 1 spec)))
- ((string= "no" tangle) nil)
- ((> (length tangle) 0) tangle)))
- (file-name (when base-name
- ;; decide if we want to add ext to base-name
- (if (and ext (string= "yes" tangle))
- (concat base-name "." ext) base-name
- (when file-name
- ;; Possibly create the parent directories for file.
- (let ((m (funcall get-spec :mkdirp))
- (fnd (file-name-directory file-name)))
- (and m fnd (not (string= m "no"))
- (make-directory fnd 'parents)))
- ;; delete any old versions of file
- (and (file-exists-p file-name)
- (not (member file-name (mapcar #'car path-collector)))
- (delete-file file-name))
- ;; drop source-block to file
- (with-temp-buffer
- (when (fboundp lang-f) (ignore-errors (funcall lang-f)))
- (when (and she-bang (not (member file-name she-banged)))
+ (mapc ;; map over file-names
+ (lambda (by-fn)
+ (let ((file-name (car by-fn)))
+ (when file-name
+ (let ((lspecs (cdr by-fn))
+ (fnd (file-name-directory file-name))
+ modes make-dir she-banged lang)
+ ;; drop source-blocks to file
+ ;; We avoid append-to-file as it does not work with tramp.
+ (with-temp-buffer
+ (mapc
+ (lambda (lspec)
+ (let* ((block-lang (car lspec))
+ (spec (cdr lspec))
+ (get-spec (lambda (name) (cdr (assq name (nth 4 spec)
+ (she-bang (let ((sheb (funcall get-spec :shebang)))
+ (when (> (length sheb) 0) sheb)))
+ (tangle-mode (funcall get-spec :tangle-mode)))
+ (unless (string-equal block-lang lang)
+ (setq lang block-lang)
+ (let ((lang-f (org-src-get-lang-mode lang)))
+ (when (fboundp lang-f) (ignore-errors (funcall lang-f)
+ ;; if file contains she-bangs, then make it executable
+ (when she-bang
+ (unless tangle-mode (setq tangle-mode #o755)))
+ (when tangle-mode
+ (add-to-list modes tangle-mode))
+ ;; Possibly create the parent directories for file.
+ (let ((m (funcall get-spec :mkdirp)))
+ (and m fnd (not (string= m "no"))
+ (setq make-dir t)))
+ ;; Handle :padlines unless first line in file
+ (unless (or (string= "no" (funcall get-spec :padline))
+ (= (point) (point-min)))
+ (insert "\n"))
+ (when (and she-bang (not she-banged))
  (insert (concat she-bang "\n"))
- (setq she-banged (cons file-name she-banged)))
- (org-babel-spec-to-string spec)
- ;; We avoid append-to-file as it does not work with tramp.
- (let ((content (buffer-string)))
- (with-temp-buffer
- (when (file-exists-p file-name)
-
Re: [PATCH] ob-tangle.el: Speed up tangling
Hi Sébastien,

Some comments while looking over this (will report back when I have
tested it out as well).

This is a section of the ob export functionality that I have been
looking at on and off for quite a while because it is responsible
for some bad and insecure behavior. I think that some of your changes
may have fixed/improved this as a side effect. I don't know whether
it is worth doing anything about the issues in this patch, but since
we are here, I think they are worth mentioning.

All of the issues that I'm aware of are related to what
happens if tangling fails part way through the process.

First, your patch already fixes a major issue which is that the modes
of all files would not be set if any one of them failed to tangle.

Next, during the process the existing file is deleted prior to
tangling, which means that it cannot be restored if tangling fails.
It would be better if the old file was moved to a temporary location
and then deleted on success or replaced on failure. This likely
requires wrapping the bits that can fail in unwind-protect and
restoring on failure or fully deleting at the end of success.

The next issue is that setting the tangle mode should happen before
the file is written: an empty file should be created, the mode should
then be set, and the contents of the file should be written only after
the mode has been set. This involves a bit of reordering of operations
in lines 124-126 of your patch. This ordering of operations prevents
security issues related to race conditions and potential errors being
evoked during write-region (though again, your changes already make
the tangling code much more secure by setting the modes on each file
immediately after writing instead of how it works currently where if
any other block encounters an error then no modes were set).

Best!
Tom

On Sun, Apr 18, 2021 at 12:23 AM Sébastien Miquel wrote:
>
> Hi,
>
> The attached patch modifies the ~org-babel-tangle~ function to avoid a
> quadratic behavior in the number of blocks tangled to a single file.
>
> Tangling an org buffer with 200 blocks to 5 different files yields a
> 25% speedup.
>
>
> * lisp/ob-tangle.el (org-babel-tangle-collect-blocks): Group
> collected blocks by tangled file name.
> (org-babel-tangle): Avoid quadratic behavior in number of blocks.
>
> --
> Sébastien Miquel
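
The ordering Tom describes in his message (create an empty file, set its mode, and only then write the contents) can be sketched as follows. The function name `my/write-with-mode` is hypothetical; this is one possible reading of the suggestion, not the code that was applied.

```elisp
;; Hypothetical sketch: `my/write-with-mode' is an illustrative name,
;; not a function from ob-tangle.el.

(defun my/write-with-mode (file-name content mode)
  "Create FILE-NAME empty, set MODE on it, then write CONTENT.
MODE is an integer such as #o755."
  ;; 1. Create the file (or truncate an existing one) with no content.
  (write-region "" nil file-name nil 'silent)
  ;; 2. Set the permissions while the file is still empty.
  (set-file-modes file-name mode)
  ;; 3. Only now write the real contents.
  (write-region content nil file-name nil 'silent))
```

One caveat of this ordering: if MODE does not grant the owner write permission (for example #o444), the final write is likely to fail, so a real implementation would need to handle that case separately.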