Re: [R] History pruning

2008-08-22 Thread Richard M. Heiberger
We have different starting points.  Please be sure that your modularity
allows a cleaned region as well as a history log to be the input to your
next step.  The history log is incomplete; lines sent to the *R* buffer by
C-c C-n are explicitly excluded from history.  Lines picked up from a saved
transcript file aren't in history.  Will your program handle this correctly:

aa <- bb +
cc

?  It is valid code.  Suppressing the line with cc will make it not valid
code.
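
A minimal sketch of why the continuation line matters, using base R's parse(). The variable names come from the example above, nothing is evaluated, and the "parse error" string is only an illustrative placeholder:

```r
# The two physical lines form a single expression.
full <- parse(text = c("aa <- bb +", "cc"))
length(full)

# Dropping the continuation line leaves an incomplete statement,
# which parse() reports as an error.
truncated <- tryCatch(parse(text = "aa <- bb +"),
                      error = function(e) "parse error")
truncated
```

Any line-based filter therefore has to work on whole parsed expressions rather than on individual lines.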


Rich

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] History pruning

2008-08-08 Thread Ken Williams



On 8/8/08 1:04 PM, Greg Snow [EMAIL PROTECTED] wrote:

> Ken,
>
> Others have given hints on pruning the history, but are you committed to doing
> this way?

Not necessarily.  Only the starting point and ending point really matter; I'd
like to be able to start with a transcript of a bunch of aimless commands
(e.g. the output of sink() or of ess-transcript-clean-buffer or a history
transcript), and end up with a nice focused handful of commands suitable for
showing to other people.  I'm sure the final goal can't typically be
achieved fully automatically, but some kind of support from tools would be
great.

Thanks for mentioning plot2script() and the TeachingDemos package, those are
indeed nice examples to look at.

-- 
Ken Williams
Research Scientist
The Thomson Reuters Corporation
Eagan, MN



Re: [R] History pruning

2008-08-04 Thread Ken Williams



On 8/1/08 1:13 PM, Richard M. Heiberger [EMAIL PROTECTED] wrote:

> I meant 5a 5b 5c.  Multiple-line commands are handled correctly.
> What it is doing is looking for "> " and "+ " prompts.  Anything else
> is removed.

When I said "5c) prune any lines that don't have assignment operators" I
meant to take a sequence like this (to pick a semi-random chunk from my
history log):

---
df <- data.frame(x=2:9, y=(1:8)^2)
cor(df)
?cor
mad(c(1:9))
?reshape
a <- matrix(1:12, nrow=3)
b <- matrix(2:13, nrow=3)
b <- matrix(4:15, nrow=3)
b <- matrix(2:13, nrow=3)
c <- matrix(4:15, nrow=3)
a
b
c
---

And turn it into this:

---
df <- data.frame(x=2:9, y=(1:8)^2)
a <- matrix(1:12, nrow=3)
b <- matrix(2:13, nrow=3)
b <- matrix(4:15, nrow=3)
b <- matrix(2:13, nrow=3)
c <- matrix(4:15, nrow=3)
---

Obviously I wouldn't *always* want this performed, but selectively it would
be quite nice.
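
A minimal sketch of step 5c, assuming the history is available as a character vector of complete commands. The helper name keep_assignments is hypothetical, not something from ESS or base R:

```r
# Hypothetical helper: keep only top-level assignments from history text.
keep_assignments <- function(lines) {
  exprs <- parse(text = lines, keep.source = FALSE)
  is_assign <- vapply(exprs, function(e) {
    # A top-level assignment parses as a call to `<-`, `<<-` or `=`.
    is.call(e) && as.character(e[[1]])[1] %in% c("<-", "<<-", "=")
  }, logical(1))
  exprs[is_assign]
}

history <- c("df <- data.frame(x=2:9, y=(1:8)^2)", "cor(df)", "?cor",
             "a <- matrix(1:12, nrow=3)", "a")
kept <- keep_assignments(history)
vapply(kept, function(e) paste(deparse(e), collapse = " "), character(1))
```

Note that this also discards calls kept only for their side effects, such as library() or set.seed(), so the result still needs a human look.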

Further, if the dependency graph among variable definitions were computable,
the sequence could be reduced to this:

---
df <- data.frame(x=2:9, y=(1:8)^2)
a <- matrix(1:12, nrow=3)
b <- matrix(2:13, nrow=3)
c <- matrix(4:15, nrow=3)
---

Note that the starting point of all of this is a sequence of commands (the
output of savehistory()), so separating commands from output isn't necessary.
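
One possible sketch of that reduction, assuming simple `name <- value` assignments and right-hand sides without side effects. The helper name drop_overwritten is hypothetical; it removes an assignment only when the same name is reassigned before anything reads it:

```r
# Hypothetical helper: drop assignments that are overwritten before being read.
drop_overwritten <- function(lines) {
  exprs <- parse(text = lines, keep.source = FALSE)
  n <- length(exprs)
  # Name assigned by a simple `name <- value` statement, else NA.
  target <- vapply(exprs, function(e) {
    if (is.call(e) && identical(e[[1]], as.name("<-")) && is.name(e[[2]]))
      as.character(e[[2]]) else NA_character_
  }, character(1))
  # Variables read by each statement (right-hand side only for simple
  # assignments, the whole statement otherwise).
  reads <- lapply(seq_len(n), function(i) {
    e <- exprs[[i]]
    if (is.na(target[i])) all.vars(e) else all.vars(e[[3]])
  })
  keep <- rep(TRUE, n)
  for (i in seq_len(n)) {
    if (is.na(target[i])) next
    for (j in seq_len(n)[-seq_len(i)]) {
      if (target[i] %in% reads[[j]]) break       # value is used: keep it
      if (identical(target[j], target[i])) {     # overwritten while unread
        keep[i] <- FALSE
        break
      }
    }
  }
  exprs[keep]
}

hist_lines <- c("df <- data.frame(x=2:9, y=(1:8)^2)",
                "a <- matrix(1:12, nrow=3)",
                "b <- matrix(2:13, nrow=3)",
                "b <- matrix(4:15, nrow=3)",
                "b <- matrix(2:13, nrow=3)",
                "c <- matrix(4:15, nrow=3)")
reduced <- drop_overwritten(hist_lines)
length(reduced)
```

A statement such as `b <- b + 1` reads b on its right-hand side, so the earlier definition of b is kept. Names reached through get(), eval() or non-standard evaluation are not seen by all.vars() in any useful way, so this is a conservative aid rather than a proof of equivalence.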

I've made a bit of progress on this, hopefully I can get clearance to show
my work soon.  It would be nice if this could be hooked into ESS for
selective pruning or something.

 -Ken


-- 
Ken Williams
Research Scientist
The Thomson Reuters Corporation
Eagan, MN



Re: [R] History pruning

2008-08-01 Thread Antony Unwin
JGR's Copy Commands command works well for me (even if it is both  
fascinating and embarrassing how little is sometimes left over).  It  
retains only commands that worked, so it is still not the minimum  
possible.

Antony Unwin
Professor of Computer-Oriented Statistics and Data Analysis,
Mathematics Institute,
University of Augsburg,
86135 Augsburg, Germany
Tel: + 49 821 5982218

[EMAIL PROTECTED]

http://stats.math.uni-augsburg.de/







Re: [R] History pruning

2008-08-01 Thread Richard M. Heiberger

> 5a) save my entire history to a text file
> 5b) open it up in Emacs
> 5c) prune any lines that don't have assignment operators
>
>
> Ken Williams
> Research Scientist
> The Thomson Reuters Corporation
> Eagan, MN


No one has yet mentioned the obvious.  ESS does your 5a 5b 5c with
   M-x ess-transcript-clean-buffer
It works in either the *R* buffer or a *.rt or *.st buffer.
It handles multiple-line commands correctly.

Make sure the buffer is writable (C-x C-q on the *.rt buffer)
M-x ess-transcript-clean-buffer
Save the buffer as a *.r file.



On automatic content analysis, that is tougher. I would be scared to do your
> 5d) prune any plotting commands that were superseded by later plots

because I don't know what "supersede" means.  I can imagine situations, for
example,
par(mfrow=c(1,2))
plot(y ~ x)
x <- x + 1
plot(y ~ x)
where I want to keep both plots.
You also have to trust that there are no side effects, which I wouldn't
want to do, because plot() changes the value of par() parameters.



Re: [R] History pruning

2008-08-01 Thread Ken Williams



On 8/1/08 12:40 PM, Richard M. Heiberger [EMAIL PROTECTED] wrote:

 
>> 5a) save my entire history to a text file
>> 5b) open it up in Emacs
>> 5c) prune any lines that don't have assignment operators
>
> No one has yet mentioned the obvious.  ESS does your 5a 5b 5c with
>    M-x ess-transcript-clean-buffer

I think you mean just 5a & 5b, right?  Lines with syntax errors are (I
think) removed, but that's it.

That part is relatively easy to perform as the first step of a tool, just by
running commands through R's parse() and discarding anything that throws an
exception.
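
A minimal sketch of that first step, assuming each element of the vector holds one complete command (a multi-line command would have to be joined first). The helper name parses_ok is hypothetical:

```r
# Hypothetical helper: TRUE if `txt` is syntactically valid R.
parses_ok <- function(txt) {
  tryCatch({
    parse(text = txt, keep.source = FALSE)
    TRUE
  }, error = function(e) FALSE)
}

commands <- c("x <- 1:10", "plot(x", "mean(x)", "x <- <- 2")
valid <- commands[vapply(commands, parses_ok, logical(1))]
valid
```

This checks syntax only. A command that parsed but failed at run time, for example because an object did not exist, still passes the filter.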

> On automatic content analysis, that is tougher. I would be scared to do your
>> 5d) prune any plotting commands that were superseded by later plots

True.  There are lots of (perhaps relatively common) edge cases that would
have to be taken into account.  Perhaps a more interactive approach would be
better, something like "get rid of this plot command and all subsequent
modifications to its canvas".  Not sure.

My basic philosophy on stuff like this is, given the choice of me fumbling
around using tools and me fumbling around without using tools, I tend to do
better when I have tools.

> You also have to trust that there are no side effects, which I wouldn't
> want to do, because plot() changes the value of par() parameters.

It does?  I wasn't aware of that, could you give an example?


-- 
Ken Williams
Research Scientist
The Thomson Reuters Corporation
Eagan, MN



Re: [R] History pruning

2008-08-01 Thread Richard M. Heiberger
I meant 5a 5b 5c.  Multiple-line commands are handled correctly.
What it is doing is looking for "> " and "+ " prompts.  Anything else
is removed.

Here is a selection from the *R* buffer and the result after cleaning.
It includes an example of par().

Rich


*R*
> options(chmhelp = FALSE)
> options(STERM='iESS', editor='gnuclient.exe')
> par()$usr
[1] 0 1 0 1
> plot(1:10)
> par()$usr
[1]  0.64 10.36  0.64 10.36
> a <-
+ 3+4


After cleaning
options(chmhelp = FALSE)
options(STERM='iESS', editor='gnuclient.exe')
par()$usr
plot(1:10)
par()$usr
a <-
3+4



Re: [R] History pruning

2008-07-31 Thread Ken Williams



On 7/30/08 1:59 PM, Marc Schwartz [EMAIL PROTECTED] wrote:

> I (and many others) use ESS (Emacs Speaks Statistics), in which case, I
> have an R source buffer in the upper frame and an R session in the lower
> frame.

I also use ESS to edit my R code (inside Aquamacs Emacs), but I usually use
the OS X port R.app for most of my interactive sessions.  Together I think
those give me roughly the same amount of IDE-like support as you've got in
your setup.

I think the ess-smart-underscore command alone is worth the price of
admission.

But none of that directly addresses the issue of automatically (or
semi-automatically) taking a long sequence of commands and pruning it down
to a smaller sequence that produces the same results.  Theoretically the
allowable prunings would be akin to those of a good optimizer.  And the R
language would seem fairly amenable to such things, with its pass-by-value
functional semantics, etc.

-- 
Ken Williams
Research Scientist
The Thomson Reuters Corporation
Eagan, MN



Re: [R] History pruning

2008-07-31 Thread Marc Schwartz

on 07/31/2008 08:35 AM Ken Williams wrote:
> On 7/30/08 1:59 PM, Marc Schwartz [EMAIL PROTECTED] wrote:
>
>> I (and many others) use ESS (Emacs Speaks Statistics), in which case, I
>> have an R source buffer in the upper frame and an R session in the lower
>> frame.
>
> I also use ESS to edit my R code (inside Aquamacs Emacs), but I usually use
> the OS X port R.app for most of my interactive sessions.  Together I think
> those give me roughly the same amount of IDE-like support as you've got in
> your setup.
>
> I think the ess-smart-underscore command alone is worth the price of
> admission.
>
> But none of that directly addresses the issue of automatically (or
> semi-automatically) taking a long sequence of commands and pruning it down
> to a smaller sequence that produces the same results.  Theoretically the
> allowable prunings would be akin to those of a good optimizer.  And the R
> language would seem fairly amenable to such things, with its pass-by-value
> functional semantics, etc.


On that point, I would need to defer to others as to any pre-existing 
tools. I am not aware of any at this point, but that does not mean that 
none exist.  Sounds like the basis of an interesting research project.


Marc

P.S. I just noted the Eagan, MN in your sig. I live in Eden Prairie...



Re: [R] History pruning

2008-07-31 Thread Ken Williams



On 7/31/08 11:01 AM, hadley wickham [EMAIL PROTECTED] wrote:

> I think that would be a very hard task -

Well, at least medium-hard.  But I think significant automatic steps could
be made, and then a human can take over for the last few steps.  That's why
I was enquiring about tools rather than a complete solution.

Does R provide facilities for introspection or interrogation of expression
objects?  I couldn't find anything useful on first look:

> methods(class=expression)
no methods were found
> dput(expression(foo - 5 * bar))
expression(foo - 5 * bar)
> str(expression(foo - 5 * bar))
  expression(foo - 5 * bar)

> it's equivalent to taking a
> long rambling conversation and then automatically turning it into a
> concise summary of what was said.  I think you must have human
> intervention.

It's not really equivalent, natural language has ambiguities and subtleties
that computer languages, especially functional languages, intentionally
don't have.  By their nature, computer languages can be turned into parse
trees unambiguously and then those trees can be manipulated.

But coincidentally I work in a Natural Language Processing group, and one of
the things we do is create exactly the kind of concise summaries you
describe. =)

-- 
Ken Williams
Research Scientist
The Thomson Reuters Corporation
Eagan, MN



Re: [R] History pruning

2008-07-31 Thread Duncan Murdoch

On 7/31/2008 2:08 PM, Ken Williams wrote:
> On 7/31/08 11:01 AM, hadley wickham [EMAIL PROTECTED] wrote:
>
>> I think that would be a very hard task -
>
> Well, at least medium-hard.  But I think significant automatic steps could
> be made, and then a human can take over for the last few steps.  That's why
> I was enquiring about tools rather than a complete solution.
>
> Does R provide facilities for introspection or interrogation of expression
> objects?  I couldn't find anything useful on first look:


You can index an expression as a list:

> e <- expression(foo - 5 * bar)
> e[[1]]
foo - 5 * bar
> str(e[[1]])
 language foo - 5 * bar

expression() returns a list of language objects, and we only asked for 
one.  We can look inside it:

> e[[1]][[1]]
`-`

The as.list function is also useful:

> as.list(e[[1]])
[[1]]
`-`

[[2]]
foo

[[3]]
5 * bar

and proceed recursively:

> as.list(e[[1]][[3]])
[[1]]
`*`

[[2]]
[1] 5

[[3]]
bar
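
The recursion can be wrapped in a small function. This is only a sketch: the name walk is hypothetical, and it does not handle calls with empty arguments such as x[1, ]. Base R's all.vars() is shown next to it as a ready-made shortcut for the variable names alone:

```r
# Hypothetical helper: list every function name, symbol and constant
# in an expression tree, depth first.
walk <- function(x) {
  if (is.call(x)) unlist(lapply(as.list(x), walk)) else deparse(x)
}

e <- expression(foo - 5 * bar)
walk(e[[1]])
all.vars(e[[1]])   # base R: variable names only
```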


Duncan Murdoch






Re: [R] History pruning

2008-07-31 Thread Ken Williams



On 7/31/08 2:12 PM, Duncan Murdoch [EMAIL PROTECTED] wrote:

 
> expression() returns a list of language objects, and we only asked for
> one.  We can look inside it:

Hey, cool.  Now let me see if I can do anything useful with that.  Thanks.

 -Ken

-- 
Ken Williams
Research Scientist
The Thomson Reuters Corporation
Eagan, MN



Re: [R] History pruning

2008-07-31 Thread hadley wickham
> It's not really equivalent, natural language has ambiguities and subtleties
> that computer languages, especially functional languages, intentionally
> don't have.  By their nature, computer languages can be turned into parse
> trees unambiguously and then those trees can be manipulated.

But in some ways that makes things easier - i.e. you don't expect to
be able to summarise a conversation/paper/book in a way that
completely recreates it - some ambiguity is unavoidable.

> But coincidentally I work in a Natural Language Processing group, and one of
> the things we do is create exactly the kind of concise summaries you
> describe. =)

Well good luck!  And I'll be interested to see anything you come up with.

Hadley


-- 
http://had.co.nz/



[R] History pruning

2008-07-30 Thread Ken Williams
Hi,

I find that a typical workflow for me looks something like this:

1) import some data from files
2) mess around with the data for a while
3) mess around with plotting for a while
4) get a plot or analysis that looks good
5) go back through my history to make a list of the shortest command
sequence to recreate the plot or analysis
6) send out that sequence to colleagues, along with the generated plots
or analysis output

I wonder if there are any tools people have developed to help with step
5.  Typically I do something like this:

5a) save my entire history to a text file
5b) open it up in Emacs
5c) prune any lines that don't have assignment operators
5d) prune any plotting commands that were superseded by later plots

and then start on other more subtle stuff like pruning assignments that
were later overwritten, unless the later assignments have variable
overlap between the LHS and the RHS.  Then I just start eyeballing it.

Would any deeper introspection of the history expressions be feasible,
e.g. detecting statements that have no side effects, dead ends, etc.?

The holy grail would be something like "show me all the statements that
contributed to the current plot" or the like.
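
A rough sketch of that idea for a named variable rather than a plot, assuming the history parses cleanly and only `<-` assignments matter. The helper name contributing is hypothetical, and side effects such as par(), set.seed() or library() are not tracked:

```r
# Hypothetical helper: walk the history backwards and keep the
# assignments that the value of `target` depends on.
contributing <- function(lines, target) {
  exprs <- parse(text = lines, keep.source = FALSE)
  needed <- target
  keep <- logical(length(exprs))
  for (i in rev(seq_along(exprs))) {
    e <- exprs[[i]]
    if (!(is.call(e) && identical(e[[1]], as.name("<-")))) next
    lhs <- all.vars(e[[2]])[1]          # variable being (re)defined
    if (!(lhs %in% needed)) next
    keep[i] <- TRUE
    if (is.name(e[[2]])) {
      # A plain `name <- value` fully defines the variable.
      needed <- union(setdiff(needed, lhs), all.vars(e[[3]]))
    } else {
      # e.g. names(x) <- value also depends on the previous x.
      needed <- union(needed, all.vars(e))
    }
  }
  exprs[keep]
}

session <- c("x <- 1:10", "y <- x^2", "z <- rnorm(5)",
             "y <- y + 1", "w <- mean(z)", "fit <- lm(y ~ x)")
slice <- contributing(session, "fit")
vapply(slice, function(e) paste(deparse(e), collapse = " "), character(1))
```

Here the statements defining z and w are dropped because fit never depends on them. Extending this to plots would need a model of which calls draw on or modify the current device, which this sketch does not attempt.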

Thanks.

-- 
Ken Williams
Research Scientist
The Thomson Reuters Corporation
Eagan, MN



Re: [R] History pruning

2008-07-30 Thread Marc Schwartz

on 07/30/2008 01:12 PM Ken Williams wrote:
> Hi,
>
> I find that a typical workflow for me looks something like this:
>
> 1) import some data from files
> 2) mess around with the data for a while
> 3) mess around with plotting for a while
> 4) get a plot or analysis that looks good
> 5) go back through my history to make a list of the shortest command
> sequence to recreate the plot or analysis
> 6) send out that sequence to colleagues, along with the generated plots
> or analysis output
>
> I wonder if there are any tools people have developed to help with step
> 5.  Typically I do something like this:
>
> 5a) save my entire history to a text file
> 5b) open it up in Emacs
> 5c) prune any lines that don't have assignment operators
> 5d) prune any plotting commands that were superseded by later plots
>
> and then start on other more subtle stuff like pruning assignments that
> were later overwritten, unless the later assignments have variable
> overlap between the LHS and the RHS.  Then I just start eyeballing it.
>
> Would any deeper introspection of the history expressions be feasible,
> e.g. detecting statements that have no side effects, dead ends, etc.?
>
> The holy grail would be something like "show me all the statements that
> contributed to the current plot" or the like.
>
> Thanks.


I (and many others) use ESS (Emacs Speaks Statistics), in which case, I 
have an R source buffer in the upper frame and an R session in the lower 
frame.


In my particular case, I also happen to use ECB (Emacs Code Browser) 
which also has a left hand column spanning the full vertical length, to 
provide access to other things (file browser, R function and data 
objects, etc.). It also helps integrate Sweave/LaTeX functionality to 
further centralize things and increase productivity. I have also tied in 
Subversion functionality to enable me to engage in version control of my 
code and other key files.


I do all of my editing in the upper frame and use the built-in ESS 
functions to submit the code to the R session. This also provides for 
code syntax highlighting, which makes it easier to visualize code as 
well as to check for things like matching parens/braces, etc.


In this way, your working code (including comments) is kept functionally 
intact in the upper frame and you can edit and use it without having to 
scroll through a long history of commands (which is still there if you 
need it).


More information here:

  http://ess.r-project.org/

HTH,

Marc Schwartz
