Hi Eric,

I’m replying to two of your emails at once.

On March 13, 2013, Eric Schulte wrote:

> This is what is already taking place. The :var header arguments are
> automatically expanded into dependencies between code blocks, and the
> results of previous code blocks are included in the hash calculation of
> the current code block.

Wow, I did not realize that the :var handling was so sophisticated.
Would it be possible to introduce a :depends code-block-name header
argument, which recycles the same dependency calculation but does not
bind a variable in the code block? Many of the variables that I rely on
between blocks are large data frames, and I worry that dumping them into
the org buffer and then reloading them into R[fn:1] will result in a
slowdown and/or loss of some structure in the data.

[fn:1] My understanding of the :var-handling code is that this is how it
acquires the values to assign to the variables, as opposed to re-using a
variable that is present in a session. But the code is complex, so
maybe I’m wrong (again).

I also think this would make the feature more discoverable: a :var is
just a sub-type of :depends, with extra functionality. Coming from a
Sweave/knitr background, I expected something like :depends, and thus
didn’t notice :var.

> From re-looking at Achim's previous noweb example, it seems that we
> currently do *not* include the values of noweb expansions in code block
> hash calculations, I think this is a bug which should be fixed.

+1

> To echo Achim's response, you've accidentally uttered Org-mode heresy.

Oh no. The good news is that thanks to your and Achim’s explanation, I
think I now understand this principle better.

>>> Oh yes, there's a whole set of _other_ problems that are waiting to be
>>> solved. :-)
>>
>> There always is. :-)
>
> I think Org-mode already provides the bulk of what is desired.
> If we agree to treat ":cache yes :results none" as obviously taking
> place for side effects, and then sticking a hash behind the :cache
> header argument with the code block, then what functionality would be
> missing?

This was more of a joke on my part: life gets boring when you run out of
problems to work on. In this specific case, though:

1) a :depends header argument
2) including the session PID in results hashes by default (because it is
   the only sensible thing to do)

On March 13, 2013, Eric Schulte wrote:

> Well, I suppose one man's dirty kludge is another's beautiful hack. The
> question here is whether the complexity lies in the implementation (and
> thus the interface) or in the code block itself. While I generally
> prefer the latter, in this case of ":results none :cache yes" I would be
> open to placing some custom logic in the backend, which stores the hash
> value with the code block, possibly changing
>
> #+begin_src R :cache yes
> # code to perform side effect
> #+end_src
>
> to
>
> #+begin_src R :cache 9f4e5b4b07e93c680ab37fc4ba1f75e1bfc0ee0a
> # code to perform side effect
> #+end_src
>
> keeping in mind that the actual hash value should be hidden after the
> first couple of characters.

If you like this solution, may I try once more to convince you of the
empty #+RESULTS solution I originally proposed? I looked at the code
for inserting/hiding/finding hash values, and it looks like it would be
complicated to change. Empty #+RESULTS is easy, although perhaps less
conceptually pure.

If you want the cache in the header, I think I can try to work on a
patch, but it does look tricky. So I am not sure I will have time to
work on it until next week. (If anyone else wants to beat me to the
punch, please feel free!)

One question: should we have the cache in the header only for :results
none blocks, or for all blocks?

> I was actually very proud of this solution.
> It is what would be done by the framework if we did implement custom
> support, but by doing it with code blocks the exact mechanics are
> visible to the user.

Agreed. But if it is the only “right” thing to do, or one of a very
small set of “right” things, I think it’s a win in terms of
conciseness/ease of use to do it automatically. And I think this is the
case here: the presence of :session yes is a clear signal that there is
out-of-band (from org’s perspective) communication happening between
code blocks, and that the invariance of a result can’t be relied on in a
different session process. So when the session PID changes, the hash
value should change as well, to trigger reevaluation.

> How should session startup be determined if not through inclusion of the
> session PID in the code block hash? Perhaps the above could be made
> more elegant through the addition of an elisp function which returns the
> pid of the current R session, allowing the above to be truncated to
> something like the following.
>
> #+begin_src R :cache yes :session foo :var pid=(R-pid "foo")
> # code to perform side effect
> x <- 'side effect'
> 'done'
> #+end_src
>
> I don't suppose ESS provides such a function?

You can get the value with

  (process-id (get-process ess-current-process-name))

which you have to evaluate in the current session buffer (the one that
C-c C-v C-z takes you to). Generally speaking, I guess each ob-foo
should provide a function to retrieve this value, since it will be
different for different languages.

-- 
Aaron Ecay
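
P.S. To make point 2) a bit more concrete, here is a rough,
language-neutral sketch of the behaviour I have in mind, written in
Python only for brevity. It is not how Org computes its hashes; the
names block_hash and needs_reevaluation, the choice of SHA-1, and the
way header arguments are serialized are all assumptions made up for
illustration. The only point is that mixing the session PID into the
hash makes a cached block look stale as soon as the session process is
restarted.

```python
import hashlib


def block_hash(body, header_args, session_pid=None):
    """Hash a code block's body and header arguments.

    Hypothetical helper: if session_pid is given, it is mixed into the
    hash, so the same block hashes differently in a different session.
    """
    h = hashlib.sha1()
    h.update(body.encode("utf-8"))
    # Sort the header arguments so their order does not affect the hash.
    for key in sorted(header_args):
        h.update(f"\0{key}={header_args[key]}".encode("utf-8"))
    if session_pid is not None:
        h.update(f"\0pid={session_pid}".encode("utf-8"))
    return h.hexdigest()


def needs_reevaluation(stored_hash, body, header_args, session_pid=None):
    """Return True when the stored hash no longer matches the block."""
    return stored_hash != block_hash(body, header_args, session_pid)
```

With this, a block cached under one session PID is left alone while
that process is alive, and is flagged for reevaluation once the same
block is hashed against a new PID, without the user having to write
:var pid=... by hand.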