Re: [R] History pruning
We have different starting points. Please be sure that your modularity allows a cleaned region as well as a history log to be the input to your next step. The history log is incomplete: lines sent to the *R* buffer by C-c C-n are explicitly excluded from history, and lines picked up from a saved transcript file aren't in history either.

Will your program handle this correctly?

    aa <- bb +
    cc

It is valid code. Suppressing the line with cc will make it invalid.

Rich

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
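A minimal sketch of the continuation-line problem raised above, using only base R's parse(). The variable names (src, exprs, n_exprs, first_line_ok) are made up for illustration, and the two-line layout of the example is an assumption about how the command was originally typed.

```r
# Two physical lines, one R expression: the trailing `+` continues onto "cc".
src <- c("aa <- bb +", "cc")

# Parsing both lines together yields a single complete expression.
exprs <- parse(text = src, keep.source = FALSE)
n_exprs <- length(exprs)

# Parsing the first line on its own fails: the expression is incomplete.
first_line_ok <- tryCatch({
  parse(text = src[1])
  TRUE
}, error = function(e) FALSE)
```

This suggests that any pruning step should work on parsed expressions rather than on physical lines, so that a command is kept or dropped as a whole.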
Re: [R] History pruning
On 8/8/08 1:04 PM, Greg Snow [EMAIL PROTECTED] wrote:

> Ken,
>
> Others have given hints on pruning the history, but are you committed to doing it this way?

Not necessarily. Only the starting point and ending point really matter; I'd like to be able to start with a transcript of a bunch of aimless commands (e.g. the output of sink() or of ess-transcript-clean-buffer or a history transcript), and end up with a nice focused handful of commands suitable for showing to other people.

I'm sure the final goal can't typically be achieved fully automatically, but some kind of support from tools would be great.

Thanks for mentioning plot2script() and the TeachingDemos package; those are indeed nice examples to look at.

--
Ken Williams
Research Scientist
The Thomson Reuters Corporation
Eagan, MN
Re: [R] History pruning
On 8/1/08 1:13 PM, Richard M. Heiberger [EMAIL PROTECTED] wrote:

> I meant 5a 5b 5c. Multiple-line commands are handled correctly.
> What it is doing is looking for > and + prompts. Anything else is removed.

When I said "5c) prune any lines that don't have assignment operators" I meant to take a sequence like this (to pick a semi-random chunk from my history log):

---
df <- data.frame(x=2:9, y=(1:8)^2)
cor(df)
?cor
mad(c(1:9))
?reshape
a <- matrix(1:12, nrow=3)
b <- matrix(2:13, nrow=3)
b <- matrix(4:15, nrow=3)
b <- matrix(2:13, nrow=3)
c <- matrix(4:15, nrow=3)
a
b
c
---

And turn it into this:

---
df <- data.frame(x=2:9, y=(1:8)^2)
a <- matrix(1:12, nrow=3)
b <- matrix(2:13, nrow=3)
b <- matrix(4:15, nrow=3)
b <- matrix(2:13, nrow=3)
c <- matrix(4:15, nrow=3)
---

Obviously I wouldn't *always* want this performed, but selectively it would be quite nice.

Further, if the dependency graph among variable definitions were computable, the sequence could be reduced to this:

---
df <- data.frame(x=2:9, y=(1:8)^2)
a <- matrix(1:12, nrow=3)
b <- matrix(2:13, nrow=3)
c <- matrix(4:15, nrow=3)
---

Note that the starting point of all of this is a sequence of commands (the output of savehistory()), so separating commands from output isn't necessary.

I've made a bit of progress on this; hopefully I can get clearance to show my work soon. It would be nice if this could be hooked into ESS for selective pruning or something.

-Ken

--
Ken Williams
Research Scientist
The Thomson Reuters Corporation
Eagan, MN
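The two reductions described above can be sketched in base R roughly as follows. This is an illustration under stated assumptions, not the tool mentioned in the message: is_assignment() and drop_overwritten() are hypothetical helper names, and the second step assumes that function calls have no side effects and that only the assignment statements themselves read earlier variables.

```r
# TRUE for top-level calls of the form  x <- ...,  x = ...,  or  x <<- ...
is_assignment <- function(e) {
  is.call(e) && as.character(e[[1]])[1] %in% c("<-", "=", "<<-")
}

# Drop an assignment when the same plain variable is reassigned later
# without its old value being read in between.
drop_overwritten <- function(assignments) {
  target <- function(e) {
    if (is.name(e[[2]])) as.character(e[[2]]) else NA_character_
  }
  keep <- rep(TRUE, length(assignments))
  for (i in seq_along(assignments)) {
    tgt <- target(assignments[[i]])
    if (is.na(tgt)) next                      # e.g. names(x) <- ...: always keep
    for (j in seq_len(length(assignments) - i) + i) {
      later <- assignments[[j]]
      # A complex left-hand side such as x[1] <- 2 also reads x.
      reads <- if (is.na(target(later))) all.vars(later) else all.vars(later[[3]])
      if (tgt %in% reads) break               # old value is used: keep statement i
      if (identical(target(later), tgt)) {    # overwritten unread: drop statement i
        keep[i] <- FALSE
        break
      }
    }
  }
  assignments[keep]
}

history <- c(
  "df <- data.frame(x=2:9, y=(1:8)^2)", "cor(df)", "?cor", "mad(c(1:9))",
  "?reshape", "a <- matrix(1:12, nrow=3)", "b <- matrix(2:13, nrow=3)",
  "b <- matrix(4:15, nrow=3)", "b <- matrix(2:13, nrow=3)",
  "c <- matrix(4:15, nrow=3)", "a", "b", "c"
)
exprs <- as.list(parse(text = history, keep.source = FALSE))

assignments <- Filter(is_assignment, exprs)   # step 5c
pruned <- drop_overwritten(assignments)       # dependency-based reduction
```

On the history chunk quoted above, the first step keeps the six assignments and the second keeps one statement each for df, a, b, and c. A statement such as x <- x + 1 is never treated as overwriting an earlier x, because its right-hand side reads the old value.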
Re: [R] History pruning
JGR's "Copy Commands" command works well for me (even if it is both fascinating and embarrassing how little is sometimes left over). It retains only commands that worked, so it is still not the minimum possible.

Antony Unwin
Professor of Computer-Oriented Statistics and Data Analysis,
Mathematics Institute,
University of Augsburg,
86135 Augsburg, Germany
Tel: + 49 821 5982218
[EMAIL PROTECTED]
http://stats.math.uni-augsburg.de/
Re: [R] History pruning
> 5a) save my entire history to a text file
> 5b) open it up in Emacs
> 5c) prune any lines that don't have assignment operators
>
> Ken Williams
> Research Scientist
> The Thomson Reuters Corporation
> Eagan, MN

No one has yet mentioned the obvious. ESS does your 5a, 5b, and 5c with

    M-x ess-transcript-clean-buffer

It works in either the *R* buffer or a *.rt or *.st buffer. It handles multiple-line commands correctly.

Make sure the buffer is writable (C-x C-q on the *.rt buffer), then

    M-x ess-transcript-clean-buffer

Save the buffer as a *.r file.

On automatic content analysis, that is tougher. I would be scared to do your

> 5d) prune any plotting commands that were superseded by later plots

because I don't know what "supersede" means. I can imagine situations, for example,

    par(mfrow=c(1,2))
    plot(y ~ x)
    x <- x + 1
    plot(y ~ x)

where I want to keep both plots.

You also have to trust that there are no side effects, which I wouldn't want to do, because plot() changes the value of par() parameters.
Re: [R] History pruning
On 8/1/08 12:40 PM, Richard M. Heiberger [EMAIL PROTECTED] wrote:

>> 5a) save my entire history to a text file
>> 5b) open it up in Emacs
>> 5c) prune any lines that don't have assignment operators
>
> No one has yet mentioned the obvious.
> ESS does your 5a 5b 5c with
> M-x ess-transcript-clean-buffer

I think you mean just 5a and 5b, right? Lines with syntax errors are (I think) removed, but that's it. That part is relatively easy to perform as the first step of a tool, just by running commands through R's parse() and discarding anything that raises an error.

> On automatic content analysis, that is tougher. I would be scared to do your
>> 5d) prune any plotting commands that were superseded by later plots

True. There are lots of (perhaps relatively common) edge cases that would have to be taken into account. Perhaps a more interactive approach would be better, something like "get rid of this plot command and all subsequent modifications to its canvas". Not sure.

My basic philosophy on stuff like this is, given the choice of me fumbling around using tools and me fumbling around without using tools, I tend to do better when I have tools.

> You also have to trust that there are no side effects, which I wouldn't want to do, because plot() changes the value of par() parameters.

It does? I wasn't aware of that; could you give an example?

--
Ken Williams
Research Scientist
The Thomson Reuters Corporation
Eagan, MN
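The parse()-based first step mentioned above might look roughly like this. It is a sketch only: parses_ok() is a hypothetical helper name, and the sample history lines are invented for illustration.

```r
# TRUE if `line` is syntactically valid R on its own, FALSE if parse() errors.
parses_ok <- function(line) {
  tryCatch({
    parse(text = line, keep.source = FALSE)
    TRUE
  }, error = function(e) FALSE)
}

# Invented sample history: two good commands, two with syntax errors.
history_lines <- c("x <- 1:10", "mean(x", "plot(x))", "y <- x^2")

kept <- Filter(parses_ok, history_lines)
```

One caveat with checking line by line: the first line of a command that continues onto the next line is rejected as incomplete, so a real tool would probably need to accumulate lines until parse() succeeds.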
Re: [R] History pruning
I meant 5a, 5b, and 5c. Multiple-line commands are handled correctly. What it is doing is looking for > and + prompts. Anything else is removed.

Here is a selection from the *R* buffer and the result after cleaning. It includes an example of par().

Rich

*R*

    > options(chmhelp = FALSE)
    > options(STERM='iESS', editor='gnuclient.exe')
    > par()$usr
    [1] 0 1 0 1
    > plot(1:10)
    > par()$usr
    [1]  0.64 10.36  0.64 10.36
    > a <-
    + 3+4

After cleaning

    options(chmhelp = FALSE)
    options(STERM='iESS', editor='gnuclient.exe')
    par()$usr
    plot(1:10)
    par()$usr
    a <-
    3+4
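For readers without Emacs, the prompt-based cleaning described above can be imitated in a few lines of R. This is a rough re-implementation of the idea, not the ESS code itself; clean_transcript() is a hypothetical name, and it assumes the default "> " and "+ " prompts.

```r
# Keep only lines that start with a "> " or "+ " prompt, then strip the prompt.
clean_transcript <- function(lines) {
  is_cmd <- grepl("^[>+] ", lines)
  sub("^[>+] ", "", lines[is_cmd])
}

transcript <- c(
  "> par()$usr",
  "[1] 0 1 0 1",
  "> plot(1:10)",
  "> par()$usr",
  "[1]  0.64 10.36  0.64 10.36",
  "> a <-",
  "+ 3+4"
)

cleaned <- clean_transcript(transcript)
```

Output lines are dropped and the continuation line of the assignment is kept. An output line that happens to begin with "> " or "+ " would be mistaken for a command, which is one reason to treat this as a sketch.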
Re: [R] History pruning
On 7/30/08 1:59 PM, Marc Schwartz [EMAIL PROTECTED] wrote:

> I (and many others) use ESS (Emacs Speaks Statistics), in which case, I have an R source buffer in the upper frame and an R session in the lower frame.

I also use ESS to edit my R code (inside Aquamacs Emacs), but I usually use the OS X port R.app for most of my interactive sessions. Together I think those give me roughly the same amount of IDE-like support as you've got in your setup. I think the ess-smart-underscore command alone is worth the price of admission.

But none of that directly addresses the issue of automatically (or semi-automatically) taking a long sequence of commands and pruning it down to a smaller sequence that produces the same results. Theoretically the allowable prunings would be akin to those of a good optimizer. And the R language would seem fairly amenable to such things, with its pass-by-value functional semantics, etc.

--
Ken Williams
Research Scientist
The Thomson Reuters Corporation
Eagan, MN
Re: [R] History pruning
on 07/31/2008 08:35 AM Ken Williams wrote:

> On 7/30/08 1:59 PM, Marc Schwartz [EMAIL PROTECTED] wrote:
>
>> I (and many others) use ESS (Emacs Speaks Statistics), in which case, I have an R source buffer in the upper frame and an R session in the lower frame.
>
> I also use ESS to edit my R code (inside Aquamacs Emacs), but I usually use the OS X port R.app for most of my interactive sessions. Together I think those give me roughly the same amount of IDE-like support as you've got in your setup. I think the ess-smart-underscore command alone is worth the price of admission.
>
> But none of that directly addresses the issue of automatically (or semi-automatically) taking a long sequence of commands and pruning it down to a smaller sequence that produces the same results. Theoretically the allowable prunings would be akin to those of a good optimizer. And the R language would seem fairly amenable to such things, with its pass-by-value functional semantics, etc.

On that point, I would need to defer to others as to any pre-existing tools. I am not aware of any at this point, but that does not mean that none exist.

Sounds like the basis of an interesting research project.

Marc

P.S. I just noted the Eagan, MN in your sig. I live in Eden Prairie...
Re: [R] History pruning
On 7/31/08 11:01 AM, hadley wickham [EMAIL PROTECTED] wrote:

> I think that would be a very hard task -

Well, at least medium-hard. But I think significant automatic steps could be made, and then a human can take over for the last few steps. That's why I was enquiring about tools rather than a complete solution.

Does R provide facilities for introspection or interrogation of expression objects? I couldn't find anything useful on first look:

    > methods(class=expression)
    no methods were found
    > dput(expression(foo <- 5 * bar))
    expression(foo <- 5 * bar)
    > str(expression(foo <- 5 * bar))
      expression(foo <- 5 * bar)

> it's equivalent to taking a long rambling conversation and then automatically turning it into a concise summary of what was said. I think you must have human intervention.

It's not really equivalent; natural language has ambiguities and subtleties that computer languages, especially functional languages, intentionally don't have. By their nature, computer languages can be turned into parse trees unambiguously and then those trees can be manipulated.

But coincidentally I work in a Natural Language Processing group, and one of the things we do is create exactly the kind of concise summaries you describe. =)

--
Ken Williams
Research Scientist
The Thomson Reuters Corporation
Eagan, MN
Re: [R] History pruning
On 7/31/2008 2:08 PM, Ken Williams wrote:

> On 7/31/08 11:01 AM, hadley wickham [EMAIL PROTECTED] wrote:
>
>> I think that would be a very hard task -
>
> Well, at least medium-hard. But I think significant automatic steps could be made, and then a human can take over for the last few steps. That's why I was enquiring about tools rather than a complete solution.
>
> Does R provide facilities for introspection or interrogation of expression objects? I couldn't find anything useful on first look:

You can index an expression as a list:

    > e <- expression(foo <- 5 * bar)
    > e[[1]]
    foo <- 5 * bar
    > str(e[[1]])
     language foo <- 5 * bar

expression() returns a list of language objects, and we only asked for one. We can look inside it:

    > e[[1]][[1]]
    `<-`

The as.list function is also useful:

    > as.list(e[[1]])
    [[1]]
    `<-`

    [[2]]
    foo

    [[3]]
    5 * bar

and proceed recursively:

    > as.list(e[[1]][[3]])
    [[1]]
    `*`

    [[2]]
    [1] 5

    [[3]]
    bar

Duncan Murdoch

>     > methods(class=expression)
>     no methods were found
>     > dput(expression(foo <- 5 * bar))
>     expression(foo <- 5 * bar)
>     > str(expression(foo <- 5 * bar))
>       expression(foo <- 5 * bar)
>
>> it's equivalent to taking a long rambling conversation and then automatically turning it into a concise summary of what was said. I think you must have human intervention.
>
> It's not really equivalent, natural language has ambiguities and subtleties that computer languages, especially functional languages, intentionally don't have. By their nature, computer languages can be turned into parse trees unambiguously and then those trees can be manipulated.
>
> But coincidentally I work in a Natural Language Processing group, and one of the things we do is create exactly the kind of concise summaries you describe. =)
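The recursive inspection shown above can be wrapped in a small function. The following is a sketch, with walk_calls() as a hypothetical name; it collects the name of every function called inside a language object, and it uses base R's all.vars() to list the variables. It does not handle calls with empty arguments such as x[1, ].

```r
# Recursively collect the names of all functions called within `e`.
walk_calls <- function(e) {
  if (!is.call(e)) return(character(0))      # symbols and constants: nothing to add
  head <- if (is.name(e[[1]])) as.character(e[[1]]) else character(0)
  c(head, unlist(lapply(as.list(e)[-1], walk_calls)))
}

e <- expression(foo <- 5 * bar)

fns  <- walk_calls(e[[1]])   # functions appearing in the call tree
vars <- all.vars(e[[1]])     # variable names appearing in the expression
```

For the expression used in this thread, fns contains the assignment operator and the multiplication operator, and vars contains foo and bar. That is the raw material a pruning tool would need to work out which statements define and use which variables.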
Re: [R] History pruning
On 7/31/08 2:12 PM, Duncan Murdoch [EMAIL PROTECTED] wrote:

> expression() returns a list of language objects, and we only asked for one. We can look inside it:

Hey, cool. Now let me see if I can do anything useful with that.

Thanks.

-Ken

--
Ken Williams
Research Scientist
The Thomson Reuters Corporation
Eagan, MN
Re: [R] History pruning
> It's not really equivalent, natural language has ambiguities and subtleties that computer languages, especially functional languages, intentionally don't have. By their nature, computer languages can be turned into parse trees unambiguously and then those trees can be manipulated.

But in some ways that makes things easier - i.e. you don't expect to be able to summarise a conversation/paper/book in a way that completely recreates it - some ambiguity is unavoidable.

> But coincidentally I work in a Natural Language Processing group, and one of the things we do is create exactly the kind of concise summaries you describe. =)

Well good luck! And I'll be interested to see anything you come up with.

Hadley

--
http://had.co.nz/
[R] History pruning
Hi,

I find that a typical workflow for me looks something like this:

1) import some data from files
2) mess around with the data for a while
3) mess around with plotting for a while
4) get a plot or analysis that looks good
5) go back through my history to make a list of the shortest command sequence to recreate the plot or analysis
6) send out that sequence to colleagues, along with the generated plots or analysis output

I wonder if there are any tools people have developed to help with step 5. Typically I do something like this:

5a) save my entire history to a text file
5b) open it up in Emacs
5c) prune any lines that don't have assignment operators
5d) prune any plotting commands that were superseded by later plots

and then start on other more subtle stuff like pruning assignments that were later overwritten, unless the later assignments have variable overlap between the LHS and the RHS. Then I just start eyeballing it.

Would any deeper introspection of the history expressions be feasible, e.g. detecting statements that have no side effects, dead ends, etc.? The holy grail would be something like "show me all the statements that contributed to the current plot" or the like.

Thanks.

--
Ken Williams
Research Scientist
The Thomson Reuters Corporation
Eagan, MN
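One way the "statements that contributed to the current plot" idea could be approximated is a backward pass over the parsed history that tracks which variables are still needed. The sketch below is only that: contributing() is a hypothetical function, the history is invented, only `<-` assignments are recognised, and every function call is assumed to be free of side effects (so settings changed through par() or options() would be missed).

```r
# Return the statements that the `goal` statement depends on, in original order.
contributing <- function(exprs, goal = length(exprs)) {
  needed <- all.vars(exprs[[goal]])    # variables the goal statement reads
  keep <- goal
  for (i in rev(seq_len(goal - 1))) {
    e <- exprs[[i]]
    is_assign <- is.call(e) && identical(e[[1]], as.name("<-"))
    if (!is_assign) next
    defined <- all.vars(e[[2]])[1]     # variable being (re)defined
    if (!(defined %in% needed)) next   # dead end: nothing downstream uses it
    keep <- c(i, keep)
    if (is.name(e[[2]])) {
      # Plain `x <- ...` fully redefines x; only its right-hand side matters now.
      needed <- union(setdiff(needed, defined), all.vars(e[[3]]))
    } else {
      # `x[i] <- ...` modifies x in place, so the old x is still needed.
      needed <- union(needed, all.vars(e))
    }
  }
  exprs[keep]
}

history <- c(
  "x <- 1:10",
  "junk <- rnorm(5)",
  "y <- x^2",
  "x <- x + 1",
  "summary(junk)",
  "plot(y ~ x)"
)
exprs <- as.list(parse(text = history, keep.source = FALSE))

slice <- contributing(exprs)
kept <- vapply(slice, function(e) paste(deparse(e), collapse = " "), character(1))
```

Here the junk assignment and the summary() call are dropped, while x <- x + 1 is kept because its right-hand side overlaps with its left-hand side, which is the exception noted in the message above.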
Re: [R] History pruning
on 07/30/2008 01:12 PM Ken Williams wrote:

> Hi,
>
> I find that a typical workflow for me looks something like this:
>
> 1) import some data from files
> 2) mess around with the data for a while
> 3) mess around with plotting for a while
> 4) get a plot or analysis that looks good
> 5) go back through my history to make a list of the shortest command sequence to recreate the plot or analysis
> 6) send out that sequence to colleagues, along with the generated plots or analysis output
>
> I wonder if there are any tools people have developed to help with step 5. Typically I do something like this:
>
> 5a) save my entire history to a text file
> 5b) open it up in Emacs
> 5c) prune any lines that don't have assignment operators
> 5d) prune any plotting commands that were superseded by later plots
>
> and then start on other more subtle stuff like pruning assignments that were later overwritten, unless the later assignments have variable overlap between the LHS and the RHS. Then I just start eyeballing it.
>
> Would any deeper introspection of the history expressions be feasible, e.g. detecting statements that have no side effects, dead ends, etc. The holy grail would be something like show me all the statements that contributed to the current plot or the like.
>
> Thanks.

I (and many others) use ESS (Emacs Speaks Statistics), in which case, I have an R source buffer in the upper frame and an R session in the lower frame.

In my particular case, I also happen to use ECB (Emacs Code Browser), which adds a left-hand column spanning the full vertical length to provide access to other things (file browser, R function and data objects, etc.). It also helps integrate Sweave/LaTeX functionality to further centralize things and increase productivity. I have also tied in Subversion functionality to enable me to engage in version control of my code and other key files.

I do all of my editing in the upper frame and use the built-in ESS functions to submit the code to the R session. This also provides for code syntax highlighting, which makes it easier to visualize code as well as to check for things like matching parens/braces, etc.

In this way, your working code (including comments) is kept functionally intact in the upper frame and you can edit and use it without having to scroll through a long history of commands (which is still there if you need it).

More information here:

http://ess.r-project.org/

HTH,

Marc Schwartz