Re: [PATCH] ob-tangle.el: Speed up tangling

2021-05-01 Thread Bastien
Hi Sébastien,

Sébastien Miquel writes:

> Indeed, I hadn't thought to run the tests, sorry. I've fixed my code
> and modified the `block-order` test in order for it to pass.

Applied with commit a2cb9b853, thank you very much for the updated patch.

> The patch does modify the order of the tangled blocks. When several
> blocks with different languages are tangled to the same file, they
> used to be grouped according to language, and are now tangled in the
> order in which they appear. I assumed this was an oversight in the
> previous code, but since this test exists, maybe it was intended?
>
> Nicolas Goaziou wrote this test, perhaps he could comment on this.

Sure, feel free to adapt the code further if needed after discussing
this.

Thanks,

-- 
 Bastien



Re: [PATCH] ob-tangle.el: Speed up tangling

2021-05-01 Thread Sébastien Miquel

Hi Bastien,

Bastien writes:

> The compiler is complaining with
>
>    In toplevel form:
>    ob-tangle.el:196:1: Warning: Variable ‘modes’ left uninitialized
>
> Also, it breaks these two tests for me:
>
> 2 unexpected results:
>    FAILED  ob-tangle/block-order
>    FAILED  ob-tangle/continued-code-blocks-w-noweb-ref


Indeed, I hadn't thought to run the tests, sorry. I've fixed my code
and modified the `block-order` test in order for it to pass.

The patch does modify the order of the tangled blocks. When several
blocks with different languages are tangled to the same file, they
used to be grouped according to language, and are now tangled in the
order in which they appear. I assumed this was an oversight in the
previous code, but since this test exists, maybe it was intended?

Nicolas Goaziou wrote this test, perhaps he could comment on this.
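
The ordering change described above can be illustrated with a small
sketch. It is not the code from the patch: `my/group-blocks-by-file'
and the (LANG FILE BODY) block shape are hypothetical, chosen only to
show blocks being grouped by target file while keeping the order in
which they appear in the document.

```elisp
;; Illustrative sketch only; names and data layout are assumptions,
;; not what ob-tangle.el actually uses.
(defun my/group-blocks-by-file (blocks)
  "Group BLOCKS, a list of (LANG FILE BODY), by FILE.
Files and the blocks within each file keep their document order."
  (let (by-file)
    (dolist (blk blocks)
      (let* ((file (nth 1 blk))
             (entry (assoc file by-file)))
        (if entry
            ;; Known file: prepend for now, order is restored below.
            (push blk (cdr entry))
          ;; First block seen for this file.
          (push (list file blk) by-file))))
    ;; Both levels were built in reverse, so reverse them back.
    (mapcar (lambda (entry)
              (cons (car entry) (reverse (cdr entry))))
            (nreverse by-file))))
```

With blocks in the languages sh, python, emacs-lisp, sh, where all but
the python block target the same file, the three blocks for that file
stay in document order instead of the two sh blocks being grouped
together.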

Regards,

--
Sébastien Miquel

From 2aa09e8d2f4e8703190e9035d711508c11b3a8eb Mon Sep 17 00:00:00 2001
From: Sébastien Miquel
Date: Sat, 1 May 2021 21:18:44 +0200
Subject: [PATCH] ob-tangle.el: Improve tangling

* lisp/ob-tangle.el (org-babel-tangle-collect-blocks): Group
collected blocks by tangled file name.
(org-babel-tangle): Avoid quadratic behavior in number of blocks and
set modes before writing to file.
* testing/lisp/test-ob-tangle.el (ob-tangle/block-order): Update test.
---
 lisp/ob-tangle.el  | 151 -
 testing/lisp/test-ob-tangle.el |   2 +-
 2 files changed, 74 insertions(+), 79 deletions(-)

diff --git a/lisp/ob-tangle.el b/lisp/ob-tangle.el
index 4c0c3132d..36144d6ae 100644
--- a/lisp/ob-tangle.el
+++ b/lisp/ob-tangle.el
@@ -225,67 +225,55 @@ matching a regular expression."
 	   (or (cdr (assq :tangle (nth 2 (org-babel-get-src-block-info 'light
 		   (user-error "Point is not in a source code block"
 	path-collector)
-	(mapc ;; map over all languages
-	 (lambda (by-lang)
-	   (let* ((lang (car by-lang))
-		  (specs (cdr by-lang))
-		  (ext (or (cdr (assoc lang org-babel-tangle-lang-exts)) lang))
-		  (lang-f (org-src-get-lang-mode lang))
-		  she-banged)
-	 (mapc
-	  (lambda (spec)
-		(let ((get-spec (lambda (name) (cdr (assoc name (nth 4 spec))
-		  (let* ((tangle (funcall get-spec :tangle))
-			 (she-bang (let ((sheb (funcall get-spec :shebang)))
- (when (> (length sheb) 0) sheb)))
-			 (tangle-mode (funcall get-spec :tangle-mode))
-			 (base-name (cond
- ((string= "yes" tangle)
-  (file-name-sans-extension
-   (nth 1 spec)))
- ((string= "no" tangle) nil)
- ((> (length tangle) 0) tangle)))
-			 (file-name (when base-name
-  ;; decide if we want to add ext to base-name
-  (if (and ext (string= "yes" tangle))
-	  (concat base-name "." ext) base-name
-		(when file-name
-		  ;; Possibly create the parent directories for file.
-		  (let ((m (funcall get-spec :mkdirp))
-			(fnd (file-name-directory file-name)))
-			(and m fnd (not (string= m "no"))
-			 (make-directory fnd 'parents)))
-		  ;; delete any old versions of file
-		  (and (file-exists-p file-name)
-			   (not (member file-name (mapcar #'car path-collector)))
-			   (delete-file file-name))
-		  ;; drop source-block to file
-		  (with-temp-buffer
-			(when (fboundp lang-f) (ignore-errors (funcall lang-f)))
-			(when (and she-bang (not (member file-name she-banged)))
+	(mapc ;; map over file-names
+	 (lambda (by-fn)
+	   (let ((file-name (car by-fn)))
+	 (when file-name
+   (let ((lspecs (cdr by-fn))
+		 (fnd (file-name-directory file-name))
+		 modes make-dir she-banged lang)
+	 ;; drop source-blocks to file
+	 ;; We avoid append-to-file as it does not work with tramp.
+	 (with-temp-buffer
+		   (mapc
+		(lambda (lspec)
+		  (let* ((block-lang (car lspec))
+			 (spec (cdr lspec))
+			 (get-spec (lambda (name) (cdr (assq name (nth 4 spec)
+			 (she-bang (let ((sheb (funcall get-spec :shebang)))
+ (when (> (length sheb) 0) sheb)))
+			 (tangle-mode (funcall get-spec :tangle-mode)))
+		(unless (string-equal block-lang lang)
+			  (setq lang block-lang)
+			  (let ((lang-f (org-src-get-lang-mode lang)))
+			(when (fboundp lang-f) (ignore-errors (funcall lang-f)
+		;; if file contains she-bangs, then make it executable
+		(when she-bang
+			  (unless tangle-mode (setq tangle-mode #o755)))
+		(when tangle-mode
+			  (add-to-list 'modes tangle-mode))
+		;; Possibly create the parent directories for file.
+		(let ((m (funcall get-spec :mkdirp)))
+			  (and m fnd (not (string= m "no"))
+			   (setq make-dir t)))
+		;; Handle :padlines unless first line in file
+		(unless (or (string= "no" (funcall get-spec :padline))
+(= (point) (point-min)))
+			  (insert "\n"))
+		(when (and she-bang (not she-banged))
 			  

Re: [PATCH] ob-tangle.el: Speed up tangling

2021-05-01 Thread Bastien
Hi Sébastien,

thanks for the patch!  I applied against master and tested it.

The compiler is complaining with

  In toplevel form:
  ob-tangle.el:196:1: Warning: Variable ‘modes’ left uninitialized

Also, it breaks these two tests for me:

2 unexpected results:  
   FAILED  ob-tangle/block-order   
   FAILED  ob-tangle/continued-code-blocks-w-noweb-ref

Let me know if you manage to pass all the tests (or fix them...) and
silence the warning.

Thanks!

-- 
 Bastien



Re: [PATCH] ob-tangle.el: Speed up tangling

2021-04-21 Thread Timothy


Sébastien Miquel writes:

> On second thought, I'm uneasy about my approach. If tangling fails,
> the user might miss the error message since it is quickly replaced by
> the tangling info. Ideally we should backup all the tangled files and
> restore them all if a single one fails to ensure we're back to a
> consistent state.
>
> I'm unsure what would be best practices here. In the case of remote
> tangled files, I don't know if temporary files should be remote or
> not, and what guarantees Emacs primitives such as ~rename-file~
> offer.

Just 2c from me on how I'd like this to work as a user, when tangling
fails:
+ Every file that could be tangled is tangled, or there's a variable
  which controls what to do on an error
+ Loud message at the end that lists all files which failed to
  tangle
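
A rough sketch of that behaviour, under stated assumptions: the
function `my/write-all-files', the option `my/tangle-stop-on-error'
and the (FILE-NAME . CONTENT) input shape are all hypothetical and do
not exist in ob-tangle.el. It only shows the "keep going, then report
every failure at the end" idea.

```elisp
;; Hypothetical names throughout; this is not ob-tangle.el code.
(defvar my/tangle-stop-on-error nil
  "When non-nil, re-signal the first error instead of continuing.")

(defun my/write-all-files (files)
  "Write each (FILE-NAME . CONTENT) pair in FILES.
Return the list of file names that could not be written."
  (let (failed)
    (dolist (entry files)
      (condition-case err
          (write-region (cdr entry) nil (car entry) nil 'silent)
        (error
         (if my/tangle-stop-on-error
             ;; Let the caller see the original error.
             (signal (car err) (cdr err))
           (push (car entry) failed)))))
    (setq failed (nreverse failed))
    ;; One loud report at the end, listing every failure.
    (when failed
      (display-warning 'my/tangle
                       (format "Failed to tangle: %s"
                               (mapconcat #'identity failed ", "))
                       :error))
    failed))
```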

--
Timothy



Re: [PATCH] ob-tangle.el: Speed up tangling

2021-04-21 Thread Sébastien Miquel

Hi Tom,

Thank you again for your comments.

Tom Gillespie writes:

> I think that the location of condition-case is ok, but I wonder what
> would happen if something were to fail before entering that? I think
> that only a subset of the files would be tangled, but they would all
> have their correct modes, so I think that that is ok.

On second thought, I'm uneasy about my approach. If tangling fails,
the user might miss the error message since it is quickly replaced by
the tangling info. Ideally we should backup all the tangled files and
restore them all if a single one fails to ensure we're back to a
consistent state.

I'm unsure what would be best practices here. In the case of remote
tangled files, I don't know if temporary files should be remote or
not, and what guarantees Emacs primitives such as ~rename-file~
offer.

Although a robust tangling system that deals with errors and
guarantees that the state ends up consistent would be nice to have,
I'll take the failure considerations off this patch to keep it simple.
It'll make a better starting point for future work at least.

As is currently the case, if tangling fails, an error will be thrown,
the user will certainly notice and should assume that everything is
broken until another tangling succeeds.

I've kept the modes improvements.

Regards,

--
Sébastien Miquel

From 6b123c956ac7abe0210cf7b1145ebe0a68f04713 Mon Sep 17 00:00:00 2001
From: Sébastien Miquel
Date: Sat, 17 Apr 2021 21:48:30 +0200
Subject: [PATCH] ob-tangle.el: Improve tangling

* lisp/ob-tangle.el (org-babel-tangle-collect-blocks): Group
collected blocks by tangled file name.
(org-babel-tangle): Avoid quadratic behavior in number of blocks and
set modes before writing to file.
---
 lisp/ob-tangle.el | 151 ++
 1 file changed, 73 insertions(+), 78 deletions(-)

diff --git a/lisp/ob-tangle.el b/lisp/ob-tangle.el
index 4c0c3132d..8ca6b66fe 100644
--- a/lisp/ob-tangle.el
+++ b/lisp/ob-tangle.el
@@ -225,67 +225,55 @@ matching a regular expression."
 	   (or (cdr (assq :tangle (nth 2 (org-babel-get-src-block-info 'light
 		   (user-error "Point is not in a source code block"
 	path-collector)
-	(mapc ;; map over all languages
-	 (lambda (by-lang)
-	   (let* ((lang (car by-lang))
-		  (specs (cdr by-lang))
-		  (ext (or (cdr (assoc lang org-babel-tangle-lang-exts)) lang))
-		  (lang-f (org-src-get-lang-mode lang))
-		  she-banged)
-	 (mapc
-	  (lambda (spec)
-		(let ((get-spec (lambda (name) (cdr (assoc name (nth 4 spec))
-		  (let* ((tangle (funcall get-spec :tangle))
-			 (she-bang (let ((sheb (funcall get-spec :shebang)))
- (when (> (length sheb) 0) sheb)))
-			 (tangle-mode (funcall get-spec :tangle-mode))
-			 (base-name (cond
- ((string= "yes" tangle)
-  (file-name-sans-extension
-   (nth 1 spec)))
- ((string= "no" tangle) nil)
- ((> (length tangle) 0) tangle)))
-			 (file-name (when base-name
-  ;; decide if we want to add ext to base-name
-  (if (and ext (string= "yes" tangle))
-	  (concat base-name "." ext) base-name
-		(when file-name
-		  ;; Possibly create the parent directories for file.
-		  (let ((m (funcall get-spec :mkdirp))
-			(fnd (file-name-directory file-name)))
-			(and m fnd (not (string= m "no"))
-			 (make-directory fnd 'parents)))
-		  ;; delete any old versions of file
-		  (and (file-exists-p file-name)
-			   (not (member file-name (mapcar #'car path-collector)))
-			   (delete-file file-name))
-		  ;; drop source-block to file
-		  (with-temp-buffer
-			(when (fboundp lang-f) (ignore-errors (funcall lang-f)))
-			(when (and she-bang (not (member file-name she-banged)))
+	(mapc ;; map over file-names
+	 (lambda (by-fn)
+	   (let ((file-name (car by-fn)))
+	 (when file-name
+   (let ((lspecs (cdr by-fn))
+		 (fnd (file-name-directory file-name))
+		 modes make-dir she-banged lang)
+	 ;; drop source-blocks to file
+	 ;; We avoid append-to-file as it does not work with tramp.
+	 (with-temp-buffer
+		   (mapc
+		(lambda (lspec)
+		  (let* ((block-lang (car lspec))
+			 (spec (cdr lspec))
+			 (get-spec (lambda (name) (cdr (assq name (nth 4 spec)
+			 (she-bang (let ((sheb (funcall get-spec :shebang)))
+ (when (> (length sheb) 0) sheb)))
+			 (tangle-mode (funcall get-spec :tangle-mode)))
+		(unless (string-equal block-lang lang)
+			  (setq lang block-lang)
+			  (let ((lang-f (org-src-get-lang-mode lang)))
+			(when (fboundp lang-f) (ignore-errors (funcall lang-f)
+		;; if file contains she-bangs, then make it executable
+		(when she-bang
+			  (unless tangle-mode (setq tangle-mode #o755)))
+		(when tangle-mode
+			  (add-to-list modes tangle-mode))
+		;; Possibly create the parent directories for file.
+		  

Re: [PATCH] ob-tangle.el: Speed up tangling

2021-04-20 Thread Tom Gillespie
Hi Sébastien,
The temp -> rename approach is good, but you should probably use
make-temp-file to create the file to reduce the risk of
collisions/race conditions. For example as (make-temp-file (concat
file-name ".tangling")).
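
As a minimal sketch of the temp -> rename idea with make-temp-file
(the helper name `my/write-file-via-temp' is hypothetical, and this is
not the code from the patch):

```elisp
;; Hypothetical helper for illustration; not part of ob-tangle.el.
(defun my/write-file-via-temp (file-name content)
  "Write CONTENT to FILE-NAME through a uniquely named temporary file.
The existing FILE-NAME is only replaced once the write has succeeded."
  (let ((tmp (make-temp-file
              ;; An absolute prefix places the temporary file next to
              ;; the target rather than in `temporary-file-directory'.
              (concat (expand-file-name file-name) ".tangling"))))
    (condition-case err
        (progn
          (write-region content nil tmp nil 'silent)
          (rename-file tmp file-name t))
      (error
       ;; Clean up the temporary file, then re-signal the error.
       (when (file-exists-p tmp) (delete-file tmp))
       (signal (car err) (cdr err))))
    file-name))
```

Note that make-temp-file creates the file with restrictive
permissions, so the modes of the final file would still have to be set
explicitly. Remote (tramp) targets are not handled by this sketch.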

I think that the location of condition-case is ok, but I wonder what
would happen if something were to fail before entering that? I think
that only a subset of the files would be tangled, but they would all
have their correct modes, so I think that that is ok.

I also think that the message to the user should probably not be
changed right now. While it might be useful for debugging, if someone
is tangling to a large number of files then the filenames/paths are
going to flood messages, so I would leave it out of this patch, and
possibly submit it as another patch for a separate discussion.

Best!
Tom



Re: [PATCH] ob-tangle.el: Speed up tangling

2021-04-19 Thread Sébastien Miquel

Hi Tom,

Thank you for the comments.

Tom Gillespie writes:
> All of the issues that I'm aware of are related to what
> happens if tangling fails part way through the process.

That's not something I had considered. I wrote a new version of the
patch (attached) which addresses the insecure behaviour and the
possibility of failure. Please tell me what you think.

I've also
 + silenced the ~write-region~ messages, since I'm now writing to
   temporary files.
 + added the list of tangled files to the message to the user at the
   end of the tangling process.
 + replaced the use of ~when-let~.

Regards,

--
Sébastien Miquel
From 82e4c1beade71194c90d377cdff7ef23532f4aa2 Mon Sep 17 00:00:00 2001
From: Sébastien Miquel
Date: Sat, 17 Apr 2021 21:48:30 +0200
Subject: [PATCH] ob-tangle.el: Improve tangling

* lisp/ob-tangle.el (org-babel-tangle-collect-blocks): Group
collected blocks by tangled file name.
(org-babel-tangle): Avoid quadratic behavior in number of blocks.
Preserve original file in case of failure. Display the list of tangled
files at the end.
---
 lisp/ob-tangle.el | 167 --
 1 file changed, 87 insertions(+), 80 deletions(-)

diff --git a/lisp/ob-tangle.el b/lisp/ob-tangle.el
index 4c0c3132d..efafef5b8 100644
--- a/lisp/ob-tangle.el
+++ b/lisp/ob-tangle.el
@@ -225,87 +225,83 @@ matching a regular expression."
 	   (or (cdr (assq :tangle (nth 2 (org-babel-get-src-block-info 'light
 		   (user-error "Point is not in a source code block"
 	path-collector)
-	(mapc ;; map over all languages
-	 (lambda (by-lang)
-	   (let* ((lang (car by-lang))
-		  (specs (cdr by-lang))
-		  (ext (or (cdr (assoc lang org-babel-tangle-lang-exts)) lang))
-		  (lang-f (org-src-get-lang-mode lang))
-		  she-banged)
-	 (mapc
-	  (lambda (spec)
-		(let ((get-spec (lambda (name) (cdr (assoc name (nth 4 spec))
-		  (let* ((tangle (funcall get-spec :tangle))
-			 (she-bang (let ((sheb (funcall get-spec :shebang)))
- (when (> (length sheb) 0) sheb)))
-			 (tangle-mode (funcall get-spec :tangle-mode))
-			 (base-name (cond
- ((string= "yes" tangle)
-  (file-name-sans-extension
-   (nth 1 spec)))
- ((string= "no" tangle) nil)
- ((> (length tangle) 0) tangle)))
-			 (file-name (when base-name
-  ;; decide if we want to add ext to base-name
-  (if (and ext (string= "yes" tangle))
-	  (concat base-name "." ext) base-name
-		(when file-name
-		  ;; Possibly create the parent directories for file.
-		  (let ((m (funcall get-spec :mkdirp))
-			(fnd (file-name-directory file-name)))
-			(and m fnd (not (string= m "no"))
-			 (make-directory fnd 'parents)))
-		  ;; delete any old versions of file
-		  (and (file-exists-p file-name)
-			   (not (member file-name (mapcar #'car path-collector)))
-			   (delete-file file-name))
-		  ;; drop source-block to file
-		  (with-temp-buffer
-			(when (fboundp lang-f) (ignore-errors (funcall lang-f)))
-			(when (and she-bang (not (member file-name she-banged)))
+	(mapc ;; map over file-names
+	 (lambda (by-fn)
+	   (let ((file-name (car by-fn)))
+	 (when file-name
+   (let ((lspecs (cdr by-fn))
+		 (fnd (file-name-directory file-name))
+		 modes make-dir she-banged lang)
+	 ;; drop source-blocks to file
+	 ;; We avoid append-to-file as it does not work with tramp.
+	 (with-temp-buffer
+		   (mapc
+		(lambda (lspec)
+		  (let* ((block-lang (car lspec))
+			 (spec (cdr lspec))
+			 (get-spec (lambda (name) (cdr (assq name (nth 4 spec)
+			 (she-bang (let ((sheb (funcall get-spec :shebang)))
+ (when (> (length sheb) 0) sheb)))
+			 (tangle-mode (funcall get-spec :tangle-mode)))
+		(unless (string-equal block-lang lang)
+			  (setq lang block-lang)
+			  (let ((lang-f (org-src-get-lang-mode lang)))
+			(when (fboundp lang-f) (ignore-errors (funcall lang-f)
+		;; if file contains she-bangs, then make it executable
+		(when she-bang
+			  (unless tangle-mode (setq tangle-mode #o755)))
+		(when tangle-mode
+			  (add-to-list modes tangle-mode))
+		;; Possibly create the parent directories for file.
+		(let ((m (funcall get-spec :mkdirp)))
+			  (and m fnd (not (string= m "no"))
+			   (setq make-dir t)))
+		;; Handle :padlines unless first line in file
+		(unless (or (string= "no" (funcall get-spec :padline))
+(= (point) (point-min)))
+			  (insert "\n"))
+		(when (and she-bang (not she-banged))
 			  (insert (concat she-bang "\n"))
-			  (setq she-banged (cons file-name she-banged)))
-			(org-babel-spec-to-string spec)
-			;; We avoid append-to-file as it does not work with tramp.
-			(let ((content (buffer-string)))
-			  (with-temp-buffer
-			(when (file-exists-p file-name)
-			  

Re: [PATCH] ob-tangle.el: Speed up tangling

2021-04-18 Thread Tom Gillespie
Hi Sébastien,
Some comments while looking over this (will report back when I have
tested it out as well). This is a section of the ob export
functionality that I have been looking at on and off for quite a
while because it is responsible for some bad and insecure behavior. I
think that some of your changes may have fixed/improved this as a side
effect. I don't know whether it is worth doing anything about the
issues in this patch, but since we are here, I think they are worth
mentioning. All of the issues that I'm aware of are related to what
happens if tangling fails part way through the process.

First, your patch already fixes a major issue which is that the modes
of all files would not be set if any one of them failed to tangle.

Next, during the process the existing file is deleted prior to
tangling, which means that it cannot be restored if tangling fails. It
would be better if the old file was moved to a temporary location and
then deleted on success or replaced on failure. This likely requires
wrapping the bits that can fail in unwind-protect and restoring on
failure or fully deleting at the end of success.

The next issue is that setting the tangle mode should happen before
the file is written: an empty file should be created, the mode should
then be set, and the contents of the file should be written only after
the mode has been set. This involves a bit of reordering of operations
in lines 124-126 of your patch. This ordering of operations prevents
security issues related to race conditions and potential errors being
evoked during write-region (though again, your changes already make
the tangling code much more secure by setting the modes on each file
immediately after writing instead of how it works currently where if
any other block encounters an error then no modes were set).

Best!
Tom
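
The "create empty file, set mode, then write" ordering suggested above
could look roughly like the following. This is only a sketch under
assumptions: `my/write-with-mode' is a hypothetical helper, not code
from the patch, and a short window remains between creating the file
and the call to set-file-modes.

```elisp
;; Hypothetical helper for illustration; not part of ob-tangle.el.
(defun my/write-with-mode (file-name content mode)
  "Create FILE-NAME empty, set MODE on it, and only then write CONTENT."
  ;; Create (or truncate to) an empty file first.
  (write-region "" nil file-name nil 'silent)
  ;; Restrict or extend permissions before any content exists.
  (set-file-modes file-name mode)
  ;; Overwriting an existing file keeps the modes set above.
  (write-region content nil file-name nil 'silent)
  file-name)
```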

On Sun, Apr 18, 2021 at 12:23 AM Sébastien Miquel wrote:
>
> Hi,
>
> The attached patch modifies the ~org-babel-tangle~ function to avoid a
> quadratic behavior in the number of blocks tangled to a single file.
>
> Tangling an org buffer with 200 blocks to 5 different files yields a
> 25 % speedup.
>
>
> * lisp/ob-tangle.el (org-babel-tangle-collect-blocks): Group
> collected blocks by tangled file name.
> (org-babel-tangle): Avoid quadratic behavior in number of blocks.
>
> --
> Sébastien Miquel