[Rd] R tcltk Gui and Rpy2

2012-03-04 Thread branch.lizard
I have a tcltk GUI that I created in R. I can open R, use the source command
to load the GUI, and then click buttons to run functions. I would like to
make this a standalone program so that the user does not have to open R and
type the source command to run the GUI. I have a very nasty workaround in
which I use R_PROFILE to launch the GUI, but I would prefer to turn the GUI
into a Python executable. I am very new to Python but quite familiar with R.
How does one go about doing this? I have dabbled in Rpy2 a little and used
the source command from within Python to get the GUI to load, but I cannot
click the buttons because the Python script session seems to close. Forgive
my terrible explanation. If you do not understand what I am asking, please
say so and I will try to provide some insight.

--
View this message in context: 
http://r.789695.n4.nabble.com/R-tcltk-Gui-and-Rpy2-tp4442989p4442989.html
Sent from the R devel mailing list archive at Nabble.com.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Conditional means and variances of a multivariate normal distribution

2012-03-04 Thread Ravi Varadhan
Hi,



Let X = (x_1, x_2, ... , x_p) be multivariate normal with mean mu = (mu_1,
... , mu_p) and covariance Sigma.  I was looking for an R function to compute
the conditional mean and conditional variance of one subset of X given another
subset of X.  While this is trivially easy to do, there is nothing in base
for doing this, at least nothing that I am aware of.  I am also not aware of
anything in the contributed packages (although my search was not
comprehensive).  I feel that this would be a useful addition, if it is not
already there.  I have written the following function, which I am sure can be
improved a lot (including better argument names!). I would like to hear your
thoughts on this.



condNormal <- function(x.given, mu, sigma, req.ind, given.ind){
# Returns conditional mean and variance of x[req.ind]
# Given x[given.ind] = x.given
# where X is multivariate Normal with
# mean = mu and covariance = sigma
#
B <- sigma[req.ind, req.ind]
C <- sigma[req.ind, given.ind]
D <- sigma[given.ind, given.ind]
cMu <- drop(mu[req.ind] + C %*% solve(D) %*% (x.given - mu[given.ind]))
cVar <- B - C %*% solve(D) %*% t(C)
list(condMean=cMu, condVar=cVar)
}

n <- 10
A <- matrix(rnorm(n^2), n, n)
A <- A %*% t(A)
condNormal(x=c(1,1,0,0,-1), mu=rep(1,n), sigma=A, req=c(2,3,5),
given=c(1,4,7,9,10))
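
For reference, the function above appears to implement the standard
conditioning result for a multivariate normal. The subscripts r (for req.ind)
and g (for given.ind) are notation introduced here for illustration; B, C and
D are the blocks of sigma named in the code.

```latex
\[
\mathrm{E}[X_r \mid X_g = x_g] = \mu_r + C D^{-1} (x_g - \mu_g),
\qquad
\mathrm{Var}[X_r \mid X_g = x_g] = B - C D^{-1} C^{\top},
\]
\[
\text{where } B = \Sigma_{rr}, \quad C = \Sigma_{rg}, \quad D = \Sigma_{gg}.
\]
```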



Best regards,

Ravi



[Rd] hash table clean-up

2012-03-04 Thread Florent D.
Hello,

I have noticed that the memory usage inside an R session increases as
more and more objects with unique names are created, even after they
are removed. Here is a small reproducible example:

> gc()
         used (Mb) gc trigger (Mb) max used (Mb)
Ncells 531720 14.2     899071 24.1   818163 21.9
Vcells 247949  1.9     786432  6.0   641735  4.9

> for (i in 1:10) {
+ name <- paste("x", runif(1), sep="")
+ assign(name, NULL)
+ rm(list=name)
+ rm(name)
+ }

> gc()
         used (Mb) gc trigger (Mb) max used (Mb)
Ncells 831714 22.3    1368491 36.6  1265230 33.8
Vcells 680551  5.2    1300721 10.0   969572  7.4

It appears the increase in memory usage is due to the way R's
environment hash table operates
(http://cran.r-project.org/doc/manuals/R-ints.html#Hash-table): as
objects with new names are created, new entries are made in the hash
table; but when the objects are removed from the environment, the
corresponding entries are not deleted.

I hope you will agree the growth in memory size is an undesirable
feature and can address the issue in a future release. If not, please
let me know why you think it should remain this way.

I believe a fix could be made around the time the hash table is
resized, where only non-removed items would be kept. I can try to make
those changes to src/main/envir.c myself, but C is not my area of
expertise. So if you beat me to it, please let me know.

Thank you,
Florent.



Re: [Rd] hash table clean-up

2012-03-04 Thread luke-tierney

On Sun, 4 Mar 2012, Florent D. wrote:


> Hello,
>
> I have noticed that the memory usage inside an R session increases as
> more and more objects with unique names are created, even after they
> are removed. Here is a small reproducible example:
>
> > gc()
>          used (Mb) gc trigger (Mb) max used (Mb)
> Ncells 531720 14.2     899071 24.1   818163 21.9
> Vcells 247949  1.9     786432  6.0   641735  4.9
>
> > for (i in 1:10) {
> + name <- paste("x", runif(1), sep="")
> + assign(name, NULL)
> + rm(list=name)
> + rm(name)
> + }
>
> > gc()
>          used (Mb) gc trigger (Mb) max used (Mb)
> Ncells 831714 22.3    1368491 36.6  1265230 33.8
> Vcells 680551  5.2    1300721 10.0   969572  7.4
>
> It appears the increase in memory usage is due to the way R's
> environment hash table operates
> (http://cran.r-project.org/doc/manuals/R-ints.html#Hash-table): as
> objects with new names are created, new entries are made in the hash
> table; but when the objects are removed from the environment, the
> corresponding entries are not deleted.


Your analysis is incorrect. What you are seeing is the fact that the
symbol or name objects used as keys are being added to the global
symbol table, and that is not garbage collected. I believe that too
many internals rely on this for it to be changed any time soon.  It
may be possible to have some symbols GC protected and others not, but
again that would require very careful thought and implementation and
isn't likely to be a priority any time soon as far as I can see.

There may be some value in having hash tables that use some form of
uninterned symbols as keys at some point but that is a larger project
that might be better provided by a contributed package, at least
initially.
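
The explanation above can be probed with a small experiment. The following is
only a sketch, assuming an R build in which the global symbol table is not
garbage collected, as described in this reply; the helper names ncells_used
and churn and the name prefix tmp_sym_ are made up for illustration. It
creates and removes objects either under a fresh name each time or under one
reused name, and compares how many cons cells stay in use afterwards.

```r
# Report the number of cons cells in use after a full garbage collection.
ncells_used <- function() {
  gc()["Ncells", "used"]
}

# Create and immediately remove n objects in a scratch environment,
# using either a new name on every iteration or the same name each time.
churn <- function(n, unique_names) {
  e <- new.env()
  for (i in seq_len(n)) {
    nm <- if (unique_names) paste0("tmp_sym_", i) else "tmp_sym_reused"
    assign(nm, NULL, envir = e)
    rm(list = nm, envir = e)
  }
  invisible(NULL)
}

n <- 20000

base <- ncells_used()
churn(n, unique_names = TRUE)
grow_unique <- ncells_used() - base

base <- ncells_used()
churn(n, unique_names = FALSE)
grow_reused <- ncells_used() - base

cat("Ncells growth with unique names:   ", grow_unique, "\n")
cat("Ncells growth with one reused name:", grow_reused, "\n")
```

If the growth comes from interned symbols rather than from leftover hash
table entries, the first figure should be much larger than the second, even
though both runs add and remove the same number of bindings.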

Best,

luke



> I hope you will agree the growth in memory size is an undesirable
> feature and can address the issue in a future release. If not, please
> let me know why you think it should remain this way.
>
> I believe a fix could be made around the time the hash table is
> resized, where only non-removed items would be kept. I can try to make
> those changes to src/main/envir.c myself, but C is not my area of
> expertise. So if you beat me to it, please let me know.
>
> Thank you,
> Florent.




--
Luke Tierney
Chair, Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:  319-335-3386
Department of Statistics and        Fax:    319-335-3017
   Actuarial Science
241 Schaeffer Hall                  email:  luke-tier...@uiowa.edu
Iowa City, IA 52242                 WWW:    http://www.stat.uiowa.edu



Re: [Rd] hash table clean-up

2012-03-04 Thread Gabor Grothendieck
On Sun, Mar 4, 2012 at 4:19 PM,  luke-tier...@uiowa.edu wrote:
> Your analysis is incorrect. What you are seeing is the fact that the
> symbol or name objects used as keys are being added to the global
> symbol table, and that is not garbage collected. I believe that too
> many internals rely on this for it to be changed any time soon.  It
> may be possible to have some symbols GC protected and others not, but
> again that would require very careful thought and implementation and
> isn't likely to be a priority any time soon as far as I can see.
>
> There may be some value in having hash tables that use some form of
> uninterned symbols as keys at some point but that is a larger project
> that might be better provided by a contributed package, at least
> initially.


Does this apply to lists too or just environments?


-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



Re: [Rd] hash table clean-up

2012-03-04 Thread Simon Urbanek

On Mar 4, 2012, at 4:40 PM, Gabor Grothendieck wrote:

> Does this apply to lists too or just environments?
 

Just environments and pairlists (the latter don't use hashing, though). Lists 
(i.e. generic vectors) are not keyed by symbols (but are not hashed, either).
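
The distinction drawn here can be illustrated with a rough experiment. This is
a sketch only, assuming (as stated earlier in the thread) that symbols created
as environment keys are never garbage collected; the helper ncells_used and
the key prefixes envkey_ and listkey_ are invented for the example. It builds
and discards an environment and a list with the same number of distinct names
and compares the cons cells left in use.

```r
# Report the number of cons cells in use after a full garbage collection.
ncells_used <- function() gc()["Ncells", "used"]

n <- 20000

# Environment: each key passed to assign() is looked up as a symbol.
base <- ncells_used()
e <- new.env(hash = TRUE)
for (i in seq_len(n)) assign(paste0("envkey_", i), i, envir = e)
rm(e)
grow_env <- ncells_used() - base

# Generic vector (list): the names are held in an ordinary character vector.
base <- ncells_used()
x <- as.list(seq_len(n))
names(x) <- paste0("listkey_", seq_len(n))
rm(x)
grow_list <- ncells_used() - base

cat("Ncells growth after discarding the environment:", grow_env, "\n")
cat("Ncells growth after discarding the list:       ", grow_list, "\n")
```

Under that assumption the environment run should leave far more cells behind
than the list run, because the list's names can be reclaimed once the list
itself is gone.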

Cheers,
S



[Rd] rpart package, text function, and round of class counts

2012-03-04 Thread yindalon
I run the following code:

library(rpart)
data(kyphosis)
fit <- rpart(Kyphosis ~ ., data=kyphosis)
plot(fit)
text(fit, use.n=TRUE)

The text labels represent the count of each class at the leaf node.
Unfortunately, the numbers are rounded and in scientific notation rather
than the exact number of examples sorted by that node in each class. 

The plot is supposed to look like
http://www.statmethods.net/advstats/images/ctree.png as per
http://www.statmethods.net/advstats/cart.html.

I'm running R 2.14.1 on a Mac.

Can anyone confirm this behavior, or point out whether I am doing something
obviously wrong that causes the counts to be displayed rounded and in
scientific notation rather than as the true counts in each class at each node?
Thanks.

--
View this message in context: 
http://r.789695.n4.nabble.com/rpart-package-text-function-and-round-of-class-counts-tp576p576.html
Sent from the R devel mailing list archive at Nabble.com.



Re: [Rd] Conditional means and variances of a multivariate normal distribution

2012-03-04 Thread Ravi Varadhan
Hi,

My previous version of the conditional MVN function had a bug: it did not
work when the conditional distribution was required for a single variable. I
fixed this and also made a few minor changes. Here is the new version.

condNormal <- function(x.given, mu, sigma, given.ind, req.ind){
# Returns conditional mean and variance of x[req.ind]
# Given x[given.ind] = x.given
# where X is multivariate Normal with
# mean = mu and covariance = sigma
#
B <- sigma[req.ind, req.ind]
C <- sigma[req.ind, given.ind, drop=FALSE]
D <- sigma[given.ind, given.ind]
CDinv <- C %*% solve(D)
cMu <- c(mu[req.ind] + CDinv %*% (x.given - mu[given.ind]))
cVar <- B - CDinv %*% t(C)
list(condMean=cMu, condVar=cVar)
}

n <- 10
A <- matrix(rnorm(n^2), n, n)
A <- A %*% t(A)
condNormal(x=c(1,1,0,0,-1), mu=rep(1,n), sigma=A, req=c(2,3,5),
given=c(1,4,7,9,10))
condNormal(x=c(1,1,0,0,-1), mu=rep(1,n), sigma=A, req=2, given=c(1,4,7,9,10))

As far as I know, there is nothing related to multivariate normal distributions 
in stats.  Hence, it seems like this function might be more useful in a 
contributed package such as fMultivar or mvtnorm.
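
One way to sanity-check the function is to compare it with an independent
route to the same quantities. The sketch below repeats the revised condNormal
and checks it against the precision-matrix form of the conditional
distribution, computed on the joint covariance of the requested and given
variables. The seed and the names idx, P, k, cVar_check and cMu_check are
choices made here for illustration and are not part of the original post.

```r
condNormal <- function(x.given, mu, sigma, given.ind, req.ind) {
  B <- sigma[req.ind, req.ind]
  C <- sigma[req.ind, given.ind, drop = FALSE]
  D <- sigma[given.ind, given.ind]
  CDinv <- C %*% solve(D)
  cMu <- c(mu[req.ind] + CDinv %*% (x.given - mu[given.ind]))
  cVar <- B - CDinv %*% t(C)
  list(condMean = cMu, condVar = cVar)
}

set.seed(1)  # arbitrary seed so the check is reproducible
n <- 10
A <- matrix(rnorm(n^2), n, n)
A <- A %*% t(A)
mu <- rep(1, n)
x.given <- c(1, 1, 0, 0, -1)
req <- c(2, 3, 5)
given <- c(1, 4, 7, 9, 10)

res <- condNormal(x.given, mu = mu, sigma = A,
                  given.ind = given, req.ind = req)

# Independent route: invert the joint covariance of (req, given) and use
# the leading block of the resulting precision matrix.
idx <- c(req, given)
k <- length(req)
P <- solve(A[idx, idx])
cVar_check <- solve(P[1:k, 1:k])
cMu_check <- c(mu[req] -
               cVar_check %*% P[1:k, -(1:k)] %*% (x.given - mu[given]))

print(all.equal(res$condVar, cVar_check, check.attributes = FALSE))
print(all.equal(res$condMean, cMu_check))
```

Both comparisons should agree up to numerical tolerance, since the conditional
covariance is the inverse of the corresponding block of the precision matrix.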

Best,
Ravi.

From: r-devel-boun...@r-project.org [r-devel-boun...@r-project.org] on behalf 
of Ravi Varadhan [rvarad...@jhmi.edu]
Sent: Sunday, March 04, 2012 10:32 AM
To: r-devel@r-project.org
Subject: [Rd] Conditional means and variances of a multivariate normal 
distribution



Re: [Rd] rpart package, text function, and round of class counts

2012-03-04 Thread Achim Zeileis

On Sun, 4 Mar 2012, yindalon wrote:


> I run the following code:
>
> library(rpart)
> data(kyphosis)
> fit <- rpart(Kyphosis ~ ., data=kyphosis)
> plot(fit)
> text(fit, use.n=TRUE)
>
> The text labels represent the count of each class at the leaf node.
> Unfortunately, the numbers are rounded and in scientific notation rather
> than the exact number of examples sorted by that node in each class.


You probably have getOption("digits") set to 4 or lower. text.rpart uses
getOption("digits") - 3 as the default, which then means only 1 significant
digit, and hence it rounds and uses scientific notation. Using


text(fit, use.n = TRUE, digits = 3)

should do the trick. Setting xpd = TRUE in addition may help to avoid
clipping of some labels.
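
The effect of the digits setting can be seen without rpart at all. The sketch
below uses base R's sprintf with C-style %g formatting purely as an
illustration of what one significant digit does to counts of this size; it is
not claimed to be the exact code path inside text.rpart, and the counts are
made-up values.

```r
counts <- c(29, 62, 12)  # hypothetical class counts at some leaf nodes

# One significant digit: counts of 10 or more fall back to scientific notation.
one_digit <- sprintf("%.1g", counts)

# Three significant digits: enough to print these counts exactly.
three_digits <- sprintf("%.3g", counts)

print(one_digit)
print(three_digits)
```

With options(digits = 4), getOption("digits") - 3 evaluates to 1, which
corresponds to the first case; passing digits = 3 explicitly corresponds to
the second.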


Also, I would recommend using

library(partykit)
plot(as.party(fit))

for visualization, which uses a display like that of the ctree() function
(also mentioned on the web page you quote below).



> The plot is supposed to look like
> http://www.statmethods.net/advstats/images/ctree.png as per
> http://www.statmethods.net/advstats/cart.html.
>
> I'm running 2.14.1 on a mac.
>
> Can anyone verify or point out if I am doing something obviously wrong for
> displaying the counts rounded and in scientific notation rather than the
> true counts in each class at each node?
> Thanks.
