Re: [R] how to randomly eliminate half the entries in a vector?

2009-02-17 Thread Gene Leynes
This is my first help post, hope it works!

Just check out the "sample" function
At the command line type:
?sample

I think it will be pretty clear from the documentation.
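
For example, here is a minimal sketch of step 2 (the vectors b1 and b2 are
made up for illustration; the other names follow the HUX function quoted
below). One caveat: sample(x, n) treats a single number x as 1:x, so it seems
safer to sample positions with sample.int() and index into the vector.

```r
# Made-up 0/1 vectors; in the real problem these would be the two chromosomes
b1 <- c(0, 1, 1, 0, 1, 0, 0, 1)
b2 <- c(1, 1, 0, 0, 0, 1, 0, 1)

# 1. positions where the two vectors differ
different <- which(xor(b1, b2))

# 2. randomly keep half of those positions (rounded down)
n_pick <- length(different) %/% 2
pick <- different[sample.int(length(different), n_pick)]

# 3. swap the bits at the selected positions, without a loop
new_b1 <- b1
new_b2 <- b2
new_b1[pick] <- b2[pick]
new_b2[pick] <- b1[pick]

rbind(new_b1, new_b2)
```

Each call draws a different random half, unless you fix the seed with
set.seed().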

On Tue, Feb 17, 2009 at 9:13 PM, Esmail Bonakdarian wrote:

> (sorry if this is a duplicate-problems with posting at my end)
> 
> Hello all,
>
> I need some help with a nice R-idiomatic and efficient solution to a
> small problem.
>
> Essentially, I am trying to eliminate randomly half of the entries in
> a vector that contains index values into some other vectors.
>
> More details:
>
> I am working with two strings/vectors of 0s and 1s. These will contain
> about 200 elements (always the same number for both)
>
> I want to:
>
> 1. determines the locations of where the two strings differ
>
>    --> easy using xor(s1, s2)
>
> 2. *randomly* selects *half* of those positions
>
>    --> not sure how to do this. I suppose the result would be
>    a list of index positions of size sum(xor(s1, s2))/2
>
> 3. exchange (flip) the bits in those random positions for both strings
>
>    --> I have something that seems to do that, but it doesn't look
>    slick and I wonder how efficient it is.
>
> Mostly I need help for #2, but will happily accept suggestions for #3,
> or for that matter anything that looks odd.
>
> Below my partial solution .. the HUX function is what I am trying
> to finish if someone can point me in the right direction.
>
> Thanks
> Esmail
> --
>
> rm(list=ls())
>
> 
> # create a binary vector of size "len"
> #
> create_bin_Chromosome <- function(len)
> {
>   sample(0:1, len, replace=T)
> }
>
> 
> # HUX - half uniform crossover
> #
> # 1. determines the locations of where the two strings
> #differ (easy xor)
> #
> # 2. randomly selects half of those positions
> #
> # 3. exchanges (flips) the bits in those positions for
> #both
> #
> HUX <- function(b1, b2)
> {
>   # 1. find differing bits
>   r=xor(b1, b2)
>
>   # positions where bits differ
>   different = which(r==TRUE)
>
>   cat("\nhrp: ", different, "\n")
>   # 2. ??? how to do this best so that each time
>   #a different half subset is selected? I.e.,
>   #sum(r)/2 positions.
>
>   # 3. this flips *all* positions, should really only flip
>   #half of them (randomly selected half)
>   new_b1 = b1
>   new_b2 = b2
>
>   for(i in different)  # should contain half the entries (randomly)
>   {
>     new_b1[i] = b2[i]
>     new_b2[i] = b1[i]
>   }
>
>   result <- matrix(c(new_b1, new_b2), 2, LEN, byrow=T)
>   result
> }
>
> LEN = 5
> b1=create_bin_Chromosome(LEN)
> b2=create_bin_Chromosome(LEN)
>
> cat(b1, "\n")
> cat(b2, "\n")
>
> idx=HUX(b1, b2)
> cat("\n\n")
> cat(idx[1,], "\n")
> cat(idx[2,], "\n")
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] portable R editor

2009-03-03 Thread Gene Leynes
After (too much) research, I've settled on SciTE, which is an open source
editor.  I really wanted emacs to work, but the crazy keyboard shortcuts
were killing me.

SciTE takes a little setup before it works well with R.
First open the global options file and change "#import r" to "import r" to
enable R syntax highlighting.

You can stop now if you like, or you can compile directly in SciTE.
(Personally, I'm happy enough to copy and paste the code by hitting
control-c, alt-tab, then control-v.)

To compile in SciTE you have to open the r.properties file (available from
the "options" menu), and add the following lines:
if PLAT_WIN
command.go.*.R="C:\Program Files\R\R-2.8.1\bin\Rscript.exe" --no-save "$(FileNameExt)"
command.go.*.R="C:\Program Files\R\R-2.8.1\bin\Rscript.exe" --save --restore "$(FileNameExt)"
command.go.subsystem.*.R=0

(You can remove the comment "#"s from the one line to disable automatic
workspace saving)

This method will compile inside SciTE, not the GUI you have open!  It will
output graphs to your working directory

I've also done a heap of work trying to submit to the open GUI, ideally line
by line.  I found some code (
http://www.sciviews.org/_rgui/projects/TpR_1.0.2.zip) that I'm sure holds
the answer, but I don't know how to use it. If someone figures it out please
let me know.
Substituting this line in SciTE r.Properties does something, but doesn't
work:
command.go.*.R="C:\TpR communicator\TpR.exe" "$(FileNameExt)"


On Tue, Mar 3, 2009 at 4:28 PM, Erich Neuwirth
wrote:

> EditPad Pro is commercial, which makes it a nonchoice for recommending
> it to my students.
> I recently switched from tinn-R to Notepad++ with NppToR and am quite
> happy with it. tinn-R is quite good, but possibly the project is
> getting too ambitious now. It needs quite some fiddling with
> Rprofile.site to make it work.
>
>
>
>
>
> Thompson, David (MNR) wrote:
> > Werner,
> >
> > Another alternative is EditPad Pro .
> >
> > DaveT.
> > *
> > Silviculture Data Analyst
> > Ontario Forest Research Institute
> > Ontario Ministry of Natural Resources
> > david.john.thomp...@ontario.ca
> > http://ontario.ca/ofri
> > *
> >> -Original Message-
> >> From: Andrew Redd [mailto:ar...@stat.tamu.edu]
> >> Sent: March 2, 2009 10:29 PM
> >> To: Michael Bibo
> >> Cc: r-h...@stat.math.ethz.ch
> >> Subject: Re: [R] portable R editor
> >>
> >> Thanks for the plug on NppToR.  Yes it is portable, but a few of the
> >> features don't work. I have it on my plan to have a launcher for the
> >> portable apps menu.  Also the website for NppToR is now
> >> https://sourceforge.net/projects/npptor/
> >>
> >> On my personal website I also keep, semi-up-to-date, a
> >> portableapps.com compatible launcher for R:
> >> http://www.stat.tamu.edu/~aredd/site/?q=node/2.  There is a full
> >> installation and just the launcher.  If you download just the launcher
> >> you can install the latest version of R into the bin directory and it
> >> will work.  I think it is version 2.8.0 in the installer.  I'll update
> >> that soon.
> >>
> >> Andrew Redd
> >>
> >> On Mon, Mar 2, 2009 at 8:13 PM, Michael Bibo wrote:
> >>> Werner Wernersen writes:
> >>>
> >>>> Hi,
> >>>>
> >>>> I have been dreaming about a complete R environment on my USB stick
> >>>> for a long time. Now I finally want to realize it but what I am
> >>>> missing is a good, portable editor for R which has tabs and syntax
> >>>> highlighting, can execute code, has bookmarks and a little project
> >>>> file management facility pretty much like Tinn-R has those. I like
> >>>> Tinn-R but it seems like there is only a very old version of Tinn-R
> >>>> which works standalone.
> >>>
> >>> Hi Werner,
> >>>
> >>> Three options:
> >>>
> >>> I have previously posted about using Emacs + ESS on a USB stick:
> >>> http://finzi.psych.upenn.edu/R/Rhelp02/archive/107419.html
> >>>
> >>> Tinn-R will work portably.  I have simply copied the installed Tinn-R
> >>> folder from one machine to another on which I do not have administrator
> >>> privileges and therefore cannot do a normal install.
> >>>
> >>> Another option is Notepad++
> >>> (http://notepad-plus.sourceforge.net/uk/site.htm).  There is even a
> >>> portable version available from www.portableapps.com.  Andrew Redd has
> >>> created (and is continuing to develop) NppToR
> >>> (http://www.stat.tamu.edu/~aredd/site/?q=node/37) which provides syntax
> >>> highlighting, code folding and code passing to R.  Notepad++ has a
> >>> tabbed interface as well as add-ons including a file explorer, windows
> >>> manager and multiclipboard manager.  It also allows for the recording of
> >

Re: [R] Cross-validation -> lift curve

2009-03-13 Thread Gene Leynes
This may be somewhat useful, but I might have more later.
http://florence.acadiau.ca/collab/hugh_public/index.php?title=R:CheckBinFit

(the code below is copied from the URL above)

CheckBinFit <- function(y, phat, nq = 20, new = TRUE, ...) {
  # Recode y to 0/1: values above the mean become 1, the rest 0
  if (is.factor(y)) y <- as.double(y)
  y <- y - mean(y)
  y[y > 0] <- 1
  y[y <= 0] <- 0
  # Bin edges: nq quantiles of the predictions, padded with 0 and 1
  quants <- quantile(phat, probs = (1:nq) / (nq + 1))
  names(quants) <- NULL
  quants <- c(0, quants, 1)
  phatD <- rep(0, nq + 1)
  phatF <- rep(0, nq + 1)
  # Mean prediction (phatF) and mean outcome (phatD) within each bin
  for (i in 1:(nq + 1)) {
    inbin <- (phat <= quants[i + 1]) & (phat > quants[i])
    phatF[i] <- mean(phat[inbin])
    phatD[i] <- mean(y[inbin])
  }
  if (new) plot(phatF, phatD, xlab = "phat", ylab = "data",
                main = paste('R^2=', cor(phatF, phatD)^2), ...)
  else points(phatF, phatD, ...)
  abline(0, 1)
  return(invisible(list(phat = phatF, data = phatD)))
}
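
The function above only checks fit on predictions you already have. For the
cross-validation part of the question, one option is to do the folds by hand.
This is only a rough sketch on simulated data (the data frame d, the
predictors x1 and x2, and the choice of 10 folds are all made up here); it is
not how cv.lm works internally.

```r
set.seed(123)
n <- 200
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))        # simulated predictors
d$y <- 10 + 2 * d$x1 - d$x2 + rnorm(n)               # simulated response

k <- 10
fold <- sample(rep(1:k, length.out = n))             # random fold labels
pred <- numeric(n)
for (j in 1:k) {
  fit <- lm(y ~ x1 + x2, data = d[fold != j, ])      # fit on the other folds
  pred[fold == j] <- predict(fit, newdata = d[fold == j, ])
}

# Lift table: rank by out-of-fold prediction (decile 1 = highest predictions),
# then compare the mean response in each decile with the overall mean
decile <- ceiling(10 * rank(-pred, ties.method = "first") / n)
lift <- tapply(d$y, decile, mean) / mean(d$y)
round(lift, 3)
```

Since every prediction comes from a model that never saw that row, the table
is an out-of-sample lift estimate on 100% of the data.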



On Thu, Mar 12, 2009 at 1:30 PM, Eric Siegel wrote:

> Hi all,
>
> I'd like to do cross-validation on lm and get the resulting lift
> curve/table
> (or, alternatively, the estimates on 100% of my data with which I can get
> lift).
>
> If such a thing doesn't exist, could it be derived using cv.lm, or would we
> need to start from scratch?
>
> Thanks!
>
> --
> Eric Siegel, Ph.D.
> President
> Prediction Impact, Inc.
>
> Predictive Analytics World Conference
> More info: www.predictiveanalyticsworld.com
> LinkedIn Group: www.linkedin.com/e/gis/1005097
>


Re: [R] R equivalent to MATLAB's "whos" Command?

2009-03-20 Thread Gene Leynes
I have found the gdata library quite helpful:

library(gdata)
ll()
ll(dimensions=TRUE)
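
If you would rather not load a package, ls.str() in base R gives a quick
overview, and something similar to ll() can be pieced together from ls(). A
rough sketch (the function name whos is my own invention, not part of base R
or gdata):

```r
# Hypothetical helper: one row per object with its class, size and dimensions
whos <- function(env = globalenv()) {
  nms <- ls(envir = env)
  if (length(nms) == 0) return(invisible(NULL))
  describe <- function(n) {
    obj <- get(n, envir = env)
    d <- dim(obj)
    data.frame(name  = n,
               class = class(obj)[1],
               dim   = if (is.null(d)) as.character(length(obj))
                       else paste(d, collapse = " x "),
               bytes = as.numeric(object.size(obj)),
               stringsAsFactors = FALSE)
  }
  do.call(rbind, lapply(nms, describe))
}

# Example on a small scratch environment
e <- new.env()
assign("m", matrix(0, 2, 3), envir = e)
assign("v", 1:5, envir = e)
w <- whos(e)
w
```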



[R] constrOptim workaround for "L-BFGS-B" or Box Constraints

2009-03-25 Thread Gene Leynes
This is not so much a question as a contribution, but comments are welcome.
Comments:
1) Thank you very much to Paul Smith for the post
https://stat.ethz.ch/pipermail/r-help/2008-March/157249.html
This is intended to build on that example with something more complex
than a 2x2 set of constraints.
2) "L-BFGS-B" does not appear to work in constrOptim

Problem:
let's say you have a function
y = a + b1*x1 + b2*x2 + b3*x3 + b4*x4
and you want to minimize (where b and x are the matrices of the b's and x's)
error = (a + b %*% x - y) ^ 2
subject to the constraints
sum(b) = 1
0 <= b <= 1
(constrOptim takes constraints in the form ui %*% theta >= ci)

# STEP 1: CREATE DATA
x=matrix(rnorm(45),nrow=9)/100
y=matrix(rnorm(9),nrow=9)/100
thetas=matrix(1/5,nrow=5)

# STEP 2: DEFINE ERROR FUNCTION
fn=function(ws){
t(x) %*% y
sum((x %*% ws-y)^2)
}

# STEP 3: DEFINE CONSTRAINTS
a=rbind(c(0, 1, 1, 1, 1),c(0,-1,-1,-1,-1))
temp=cbind(0,rbind(diag(4),-diag(4)))
adj=c(1.001,.999)
a=rbind(a*adj,temp)
b=c(1,-1,rep(0,4),rep(-1,4))

# STEP 4: DEFINE INITIAL STARTING POINT
thetas=matrix(c(1/5,1/4,1/4,1/4,1/4),nrow=5)

# STEP 4 (ALT): DEFINE INITIAL STARTING POINT
#thetas=matrix(runif(5),nrow=5)
#thetas[2:5]=thetas[2:5]/sum(thetas[2:5])

# ((TESTING, make sure at
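
For reference, a self-contained sketch of how constraints of this shape can
be passed to constrOptim. The data, the intercept column, and the names ui,
ci, w0 and sse are assumptions for illustration, not the code from the
message above. The sum-to-one constraint is written as two inequalities with
a little slack (the 1.001 and 0.999 factors) because the starting point has
to lie strictly inside the feasible region.

```r
set.seed(1)
x <- cbind(1, matrix(rnorm(36), nrow = 9) / 100)   # made-up data: intercept + 4 x's
y <- rnorm(9) / 100
sse <- function(w) sum((x %*% w - y)^2)            # objective: sum of squared errors

# constrOptim enforces ui %*% w - ci >= 0
ui <- rbind(c(0,  1,  1,  1,  1) * 1.001,   # sum(b) >= 1 / 1.001
            c(0, -1, -1, -1, -1) * 0.999,   # sum(b) <= 1 / 0.999
            cbind(0,  diag(4)),             # each b >= 0
            cbind(0, -diag(4)))             # each b <= 1
ci <- c(1, -1, rep(0, 4), rep(-1, 4))

w0 <- c(0, rep(1/4, 4))                     # intercept 0, betas sum to 1
stopifnot(all(ui %*% w0 - ci > 0))          # start must be strictly feasible

# grad = NULL makes constrOptim fall back to Nelder-Mead
fit <- constrOptim(w0, sse, grad = NULL, ui = ui, ci = ci)
fit$par
sum(fit$par[2:5])
```

The first coefficient (the intercept) has a zero in every row of ui, so it is
left unconstrained.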


[R] get optim results into a model object

2009-04-07 Thread Gene Leynes
Hello all, I have an optimization routine that is giving me good results,
but the results are not in the nice "model" format like "lm".  How can I get
optim results into a model so that I can use the clever 'fitted',
'residuals', and 'summary' functions?

Using optim is the only way that I was able to make a model that
1) sums the betas to 1,
2) constrains the betas to positive numbers less than 1
3) does not constrain alpha
(The constrOptim cousin wasn't very accurate, and was very slow.)

Here is an example of some code, the results of which I would like to get
into a model with the form
y ~ alpha + REALPAR * x
where 'REALPAR' is the "normalized" output at the very end

many thanks
Code Below

set.seed(121)
x1=.04
for (i in 1:14) x1[i+1]=x1[i]*(1+rnorm(1)*.008)+.00025
x2=.08
for (i in 1:14) x2[i+1]=x2[i]*(1+rnorm(1)*.03)-.0018
x3=.01
for (i in 1:14) x3[i+1]=x3[i]*(1+rnorm(1)*.15)-.0008

b=matrix(c(0.6,0.0,0.4))
x=matrix(cbind(x1,x2,x3),ncol=3)
y=x%*%b  # the 'real' y
yhat=y+runif(15)*.006# the observed y

plot(x=1:15,ylim=c(min(x1,x2,x3),max(x1,x2,x3)))
matlines(cbind(x,y,yhat))

# Add a constant to x (for alpha)
x=cbind(1,x)
# "normalization" fun to make the rest of the x's add up to 1
normalize=function(x)c(x[1], x[2:length(x)]/sum(x[2:length(x)]))
# objective function:
fn=function(ws){
  ws=normalize(ws)
  sum((x %*% ws - yhat)^2)
}

llim = c(-Inf,rep(0,ncol(x)-1))  # alpha (col 1 of 'x') is -Inf to Inf
ulim = c( Inf,rep(1,ncol(x)-1)) # betas (cols 2:4 of 'x') are 0 to 1
th=matrix(c(0,rep(1/3,3)))
o = optim(th, fn, method="L-BFGS-B",lower=llim, upper=ulim)
o
REALPAR = normalize(o$par)
REALPAR
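
One possible direction, as far as I can tell: the default methods for coef(),
fitted() and residuals() just look up the list components "coefficients",
"fitted.values" and "residuals", so a small list with those names already
behaves a bit like a model. A sketch under that assumption (the data are
simulated, and as_optim_model and the class name are made up; summary() would
still need its own method):

```r
set.seed(42)
x <- cbind(1, matrix(runif(45, 0, 0.1), ncol = 3))   # made-up data: intercept + 3 x's
y <- drop(x %*% c(0.01, 0.6, 0.1, 0.3)) + rnorm(15, sd = 0.001)

normalize <- function(w) c(w[1], w[-1] / sum(w[-1]))
fn <- function(w) sum((x %*% normalize(w) - y)^2)

# lower bound slightly above 0 so normalize() never divides by zero
o <- optim(c(0, rep(1/3, 3)), fn, method = "L-BFGS-B",
           lower = c(-Inf, rep(1e-8, 3)), upper = c(Inf, rep(1, 3)))

# Hypothetical wrapper: store the pieces under the names the default
# coef(), fitted() and residuals() methods look for
as_optim_model <- function(par, x, y) {
  fv <- drop(x %*% par)
  structure(list(coefficients = par,
                 fitted.values = fv,
                 residuals = drop(y) - fv),
            class = "optim_model")
}

m <- as_optim_model(normalize(o$par), x, y)
coef(m)
head(fitted(m))
sum(residuals(m)^2)
```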



Re: [R] Sequences

2009-04-07 Thread Gene Leynes
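Not sure what you're trying to accomplish, but I think the index values are
off.  The first element of s is s[1], not s[0] (R vectors are indexed from 1),
so s[i-1] is empty when i is 1.

Here's something that works, with the initial value stored in s[1]:
lambs<-rnorm(207)*1000     # made-up data standing in for the temperatures
s<-rep(0,length(lambs))
s[1]<-0
for (i in 2:length(lambs)){
   s[i]<-s[i-1]-mean(lambs)
}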
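
A small sketch of where the "replacement has length zero" error comes from,
and a version of the recurrence without a loop (lambs is random data made up
for the example; the initial value 0 is stored in position 1):

```r
set.seed(1)
lambs <- rnorm(207) * 1000        # made-up stand-in for the temperature vector
n <- length(lambs)

s <- numeric(n)
length(s[0])                      # s[0] is an empty vector, hence the error

# Loop version: start at 2 so that s[i - 1] always exists
for (i in 2:n) s[i] <- s[i - 1] - mean(lambs)

# Same sequence without a loop: each step subtracts the same mean
s_vec <- -mean(lambs) * (0:(n - 1))
all.equal(s, s_vec)
```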




On Tue, Apr 7, 2009 at 7:13 AM, Melissa2k9 wrote:

>
> Hi,
>
> I am trying to make a sequence and am  using a for loop for this. I want to
> start off with an initial value ie S[0]=0 then use the loop to create other
> values. This is what I have so far but I just keep getting error messages.
>
> #To calculate the cumulative sums:
>
> s<-rep(0,207)    # as this is the length of the vector I know I will have
> s<-as.vector(s)
> s[0]<-0
> for (i in 1:length(lambs))    # where lambs is a vector of length 207
>                               # consisting of temperature values
> {
>    s[i]<-s[i-1]-mean(lambs)
> }
>
> I continually get the error message:
>
> Error in s[i] <- s[i - 1] - mean(lambs) : replacement has length zero
>
>
> When I merely use s[i]<-i-mean(lambs) it works so there is obviously
> something wrong with the s[i-1] but i cant see what. All I want is for each
> S[i] to be the previous value for S - the mean!
> --
> View this message in context:
> http://www.nabble.com/Sequences-tp22927714p22927714.html
> Sent from the R help mailing list archive at Nabble.com.
>


Re: [R] get optim results into a model object

2009-04-07 Thread Gene Leynes
Very nice trick!!! Thank you!

When you combine your trick with the "nls" function, you get the whole
thing:
#(I have also simplified the model since the first post)
###
#Original Method: uses optim, but doesn't create "model":
###
x=matrix(c(0.04, 0.08, 0.01, 0.0398, 0.081, 0.00915, 0.04057, 0.07944,
0.00994, 0.04137, 0.07949, 0.01132, 0.0421, 0.08273, 0.00947, 0.04237,
0.08058, 0.00969, 0.0425, 0.07952, 0.00919, 0.04305, 0.07717, 0.00908,
0.04319, 0.07576, 0.0061, 0.04298, 0.07557, 0.00539, 0.04287, 0.07244,
0.00542, 0.04318, 0.071, 0.00388, 0.04348, 0.07053, 0.00375, 0.04335,
0.07075, 0.00364, 0.04386, 0.07481, 0.00296),ncol=3,byrow=T)
x=cbind(1,x)

y=matrix(c(0.02847,0.02831,0.03255,0.02736,0.03289,0.02774,0.03192,0.03141,
0.02927,0.02648,0.02807,0.03047,0.03046,0.02541,0.02729))

plot(x=1:15,ylim=range(x[,-1]))
matlines(cbind(x[,-1],y))

# "normalization" fun to make the rest of the x's add up to 1
normalize=function(x)c(x[1], x[2:length(x)]/sum(x[2:length(x)]))
# objective function:
fn=function(ws){
  ws=normalize(ws)
  sum((x %*% ws - y)^2)  # y, not yhat: this script only defines y
}

llim = c(-Inf,rep(0,ncol(x)-1))  # alpha is -Inf to Inf
ulim = c( Inf,rep(1,ncol(x)-1)) # betas are 0 to 1
th=matrix(c(0,rep(1/3,3)))
o = optim(th, fn, method="L-BFGS-B",lower=llim, upper=ulim)
NormPar = normalize(o$par)
cat('pars:', NormPar,'\n', 'tot:', sum(NormPar), '\n')

###
# Dr. Varadhan's Method to make betas sum to 1:
###
xnew=rbind(c(0, 1, 1, 1),x)
ynew=rbind(1,y)
ans <- lm(ynew ~ xnew - 1)
summary(ans)
sum(ans$resid^2)  # compare with the objective function value from optim (o$value)
cat('pars:', ans$coef,'\n', 'tot:', sum(ans$coef[2:4]), '\n')

###
# Dr. Varadhan's Method WITH nls Method
###
xnew=rbind(c(0, 1, 1, 1),x)
ynew=rbind(1,y)
df=data.frame(cbind(ynew,xnew))
colnames(df)=c('y','x0','x1','x2','x3')
aformula = y ~ b0*x0 + b1*x1 + b2*x2 + b3*x3
ans2 <- nls(aformula,data=df,algorithm='port',
start=list(b0=0,b1=1/3,b2=1/3,b3=1/3),
lower=c(-Inf,rep(0,3)),upper=c(Inf,rep(1,3)),trace=TRUE)
summary(ans2)
sum(residuals(ans2)^2)  # compare with the objective function value from optim (o$value)
cat('pars:', coef(ans2),'\n', 'tot:', sum(coef(ans2)[2:4]), '\n')
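
One thing to keep in mind, as far as I understand the added-row trick: the
extra row adds (1 - sum(b))^2 to the sum of squares, so it is a penalty
rather than a hard constraint. How closely the betas sum to 1 depends on the
scale of x. A sketch with simulated data on a larger scale (all data and the
weight of 1e6 are made up), where giving the constraint row a large weight
tightens it:

```r
set.seed(7)
x <- cbind(1, matrix(rnorm(45), ncol = 3))           # made-up data on unit scale
y <- drop(x %*% c(0.2, 0.3, 0.1, 0.1)) + rnorm(15, sd = 0.05)

xnew <- rbind(c(0, 1, 1, 1), x)   # constraint row: b1 + b2 + b3 ...
ynew <- c(1, y)                   # ... should equal 1

fit_w1  <- lm(ynew ~ xnew - 1)                                  # weight 1 on the row
fit_big <- lm(ynew ~ xnew - 1, weights = c(1e6, rep(1, 15)))    # heavy weight on the row

sum(coef(fit_w1)[2:4])    # only pulled part of the way towards 1
sum(coef(fit_big)[2:4])   # very close to 1
```

This still does nothing to keep the individual betas between 0 and 1; for
that the bounded nls fit above is the better fit.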




On Tue, Apr 7, 2009 at 12:37 PM, Ravi Varadhan  wrote:

> Hi Gene,
>
> Try the following approach using lm():
>
> set.seed(121)
> x1=.04
> for (i in 1:14) x1[i+1]=x1[i]*(1+rnorm(1)*.008)+.00025
> x2=.08
> for (i in 1:14) x2[i+1]=x2[i]*(1+rnorm(1)*.03)-.0018
> x3=.01
> for (i in 1:14) x3[i+1]=x3[i]*(1+rnorm(1)*.15)-.0008
>
> b=matrix(c(0.6,0.0,0.4))  # why don't you assume a value for intercept?
> x=matrix(cbind(x1,x2,x3),ncol=3)
>
> y=x%*%b  # the 'real' y
> yhat=y+runif(15)*.006# the observed y
> x=cbind(1,x)
>
> # here is the simple trick:  add a row to X matrix and an element to yhat
> # vector to impose constraints
> #
> xadd <- c(0, 1, 1, 1)
> xnew <- rbind(xadd, x)
> ynew <- c(1, yhat)
>
> ans <- lm(ynew ~ xnew - 1)
> summary(ans)
> sum(ans$coef[2:4])
> sum(ans$resid^2)  # compare with the objective function value from your approach
> > sum(ans$resid^2)
> [1] 2.677474e-05
> >
>
> > o$value
> [1] 2.718646e-05
> >
>
> Note that the sum of squared residuals from lm() is smaller than your value
> from optim().  Although, this approach works well in your example, it does
> not guarantee that the coefficients are between 0 and 1.
>
> Ravi.
>
>
> 
> ---
>
> Ravi Varadhan, Ph.D.
>
> Assistant Professor, The Center on Aging and Health
>
> Division of Geriatric Medicine and Gerontology
>
> Johns Hopkins University
>
> Ph: (410) 502-2619
>
> Fax: (410) 614-9625
>
> Email: rvarad...@jhmi.edu
>
> Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html
>
>
>
>
> 
> 
>
>
> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> On
> Behalf Of Gene Leynes
> Sent: Tuesday, April 07, 2009 12:17 PM
> To: r-help@r-project.org
> Subject: [R] get

Re: [R] Constrained, multiple response statistics

2009-04-08 Thread Gene Leynes
This sounds very similar to what I've been working on, but I'm not sure
without an example.

My solution has been to use an optimization that normalizes inside the
objective function.  The betas returned by optim are not normalized, but
since they were normalized inside the objective function, normalizing them
after the fact gives the values the objective function actually used.

see this example:
http://markmail.org/message/ze5237m6gbgvvvyf

Still, after looking at several statistical packages, and considering the
thoughtful responses from my post, I think that there must be a better way
using existing models, so I've been looking at other packages / models.

On Tue, Apr 7, 2009 at 6:10 PM, Jonathan Greenberg wrote:

> R'ers:
>
>   I was hoping I could get some direction on this.  I have a dataset of the
> form:
>
> Y1,Y2,...,YM = f(X1,X2,...,XN), where N is >>> M
>
> The response data (Y1,Y2,...,YM) is frequency data, such that the sum of
> all Yi = 1.0.  Both Xj and Yi are continuous variables.
>
> I'm trying to figure out the best approach(es) to solving for the model f()
> -- any ideas?  I could solve each Y one at a time, but the lack of
> constraint worries me, and I'm pretty sure that normalizing the data
> afterwards to sum to 1.0 is not going to work out properly.  Thoughts?  I've
> never worked with multiple response statistics before, so I'm mostly trying
> to get some pointers on where to begin investigating...
>
> --j
>
> --
>
> Jonathan A. Greenberg, PhD
> Postdoctoral Scholar
> Center for Spatial Technologies and Remote Sensing (CSTARS)
> University of California, Davis
> One Shields Avenue
> The Barn, Room 250N
> Davis, CA 95616
> Cell: 415-794-5043
> AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307
>


Re: [R] productivity tools in R?

2009-07-01 Thread Gene Leynes
I have recently discovered the "playwith" library, which is great for
creating complex lattice objects.

If you start with a simple lattice plot then modify it using playwith,
you can export the code to produce the spiffed up plot.

I noticed this function at the bottom of the xyplot documentation in zoo:
library/zoo/html/xyplot.zoo.html

# playwith (>= 0.8-55)
library("zoo")       # provides zoo() and the xyplot method for zoo objects
library("lattice")   # provides the xyplot generic
library("playwith")
z3 <- zoo(cbind(a = rnorm(100), b = rnorm(100) + 1), as.Date(1:100))
playwith(xyplot(z3), time.mode = TRUE)

On Jul 1, 2009, at 11:58 AM, Michael wrote:

> Hi all,
>
> Could anybody point me to some latest productivity tools in R? I am
> interested in speeding up my R programming and improving my efficiency
> in terms of debugging and developing R programs.
>
> I saw my friend has a R Console window which has automatic syntax
> reminder when he types in the first a few letters of R command. And
> he's using R under MAC. Is that a MAC thing, or I could do the same on
> my PC Windows?
>
> More pointers about using R for efficiency in development are highly
> apprecaited!
>
> Thanks a lot!
>


Re: [R] productivity tools in R?

2009-07-02 Thread Gene Leynes
I have recently discovered the "playwith" library, which is great for
creating complex lattice objects.
If you start with a simple lattice plot then modify it using playwith, you
can export the code to produce the spiffed up plot.

I noticed this function at the bottom of the xyplot documentation in zoo:
library/zoo/html/xyplot.zoo.html

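# playwith (>= 0.8-55)
library("zoo")       # provides zoo() and the xyplot method for zoo objects
library("lattice")   # provides the xyplot generic
library("playwith")
z3 <- zoo(cbind(a = rnorm(100), b = rnorm(100) + 1), as.Date(1:100))
playwith(xyplot(z3), time.mode = TRUE)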


On Wed, Jul 1, 2009 at 11:58 AM, Michael  wrote:

> Hi all,
>
> Could anybody point me to some latest productivity tools in R? I am
> interested in speeding up my R programming and improving my efficiency
> in terms of debugging and developing R programs.
>
> I saw my friend has a R Console window which has automatic syntax
> reminder when he types in the first a few letters of R command. And
> he's using R under MAC. Is that a MAC thing, or I could do the same on
> my PC Windows?
>
> More pointers about using R for efficiency in development are highly
> apprecaited!
>
> Thanks a lot!
>


[R] How to replace NAs in a vector of factors?

2009-07-21 Thread Gene Leynes
# Just when I thought I had the basic stuff mastered
# This has been quite perplexing, thanks for any help


## Here's the example:

db1=data.frame(
    olditems=c('soup','','','','nuts'),
    prices=c(4.45, 3.25, 4.42, 2.25, 3.98),
    stringsAsFactors=TRUE)  #the default before R 4.0
db2=data.frame(
    newitems=c('stew','crackers','tofu','goatsmilk','peanuts'),
    stringsAsFactors=TRUE)

str(db1)  #factors and prices
str(db2)  #new names, but I want *only* the updates

is.na(db1$olditems)  #a little surprising that '' is not equal to NA
db1$olditems==''     #oh good, at least I can get to the blanks this way
db1$olditems[db1$olditems=='']  #wait, only one item is returned?
db1[db1$olditems=='',]  #somehow this works!

#how would I get the new item names into the old items column of db1??
# I was expecting that this would work:
#db1$olditems[db1$olditems=='']=
#    db2$newitems[db1$olditems=='']
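
For anyone reading along, my understanding of why the commented-out
assignment fails: a factor only accepts values that are already among its
levels, so new names like 'crackers' turn into NA (with an "invalid factor
level" warning). One way around it is to go through character vectors. A
sketch using the same example data (the names blank and items are new here):

```r
db1 <- data.frame(olditems = c('soup', '', '', '', 'nuts'),
                  prices = c(4.45, 3.25, 4.42, 2.25, 3.98),
                  stringsAsFactors = TRUE)
db2 <- data.frame(newitems = c('stew', 'crackers', 'tofu', 'goatsmilk', 'peanuts'),
                  stringsAsFactors = TRUE)

levels(db1$olditems)              # "" is an ordinary level, not NA

blank <- db1$olditems == ''       # rows that need an update
items <- as.character(db1$olditems)
items[blank] <- as.character(db2$newitems[blank])   # assumes the rows line up
db1$olditems <- factor(items)     # rebuild the factor with the new levels

db1
```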



Re: [R] list of lm() results

2009-07-21 Thread Gene Leynes
I found that it was easiest to just pull out the parts I want with an
"apply" loop.

Here I am regressing a bunch of equity returns on some index returns and
just keeping the coefficients:

EqCoefQ1 = apply(retEqQ1,2,
function(x) summary(lm(x~retIndexQ1))$coefficients)
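
If you do want to keep the whole lm objects in a list, I believe the usual
fix is double brackets: myResults[[i]] stores the fitted model as one list
element, while myResults[i] <- lm(...) tries to spread the model's components
over a single slot and gives the "not a multiple of replacement length"
warning. A sketch with made-up data (d, y, x1 to x3 are placeholders):

```r
set.seed(1)
d <- data.frame(y = rnorm(20), x1 = rnorm(20), x2 = rnorm(20), x3 = rnorm(20))
predictors <- c("x1", "x2", "x3")

myResults <- vector("list", length(predictors))
names(myResults) <- predictors
for (i in seq_along(predictors)) {
  f <- reformulate(predictors[i], response = "y")   # builds y ~ x1, y ~ x2, ...
  myResults[[i]] <- lm(f, data = d)                 # [[ ]] stores the whole model
}

# Loop over the stored models, e.g. to pull out each slope
slopes <- sapply(myResults, function(m) coef(m)[2])
slopes
```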

On Tue, Jul 21, 2009 at 4:49 PM, Giovanni Petris  wrote:

>
> My guess is that you did not define the list correcly before making
> any assignments to it. Try something like
>
> > myResults <- vector("list", 16)
>
> and then your code
>
> > myResults[1] <- lm(...)
> > myResults[2] <- lm(...)
> > myResults[3] <- lm(...)
>  ...
> > myResults[15] <- lm(...)
> > myResults[16] <- lm(...)
>
> HTH,
> Giovanni
>
> > Date: Tue, 21 Jul 2009 11:27:52 -0500
> > From: Idgarad 
> >
> > How can I get the results of lm() into a list so I can loop through the
> results?
> >
> > e.g.
> >
> > myResults[1] <- lm(...)
> > myResults[2] <- lm(...)
> > myResults[3] <- lm(...)
> > ...
> > myResults[15] <- lm(...)
> > myResults[16] <- lm(...)
> >
> > so far every attempt I've tried doesn't work throwing a "number of
> > items to replace is not a multiple of replacement length" error or
> > simply not working.
> >
> >
> >
>
> --
>
> Giovanni Petris  
> Associate Professor
> Department of Mathematical Sciences
> University of Arkansas - Fayetteville, AR 72701
> Ph: (479) 575-6324, 575-8630 (fax)
> http://definetti.uark.edu/~gpetris/
>


Re: [R] How to replace NAs in a vector of factors?

2009-07-22 Thread Gene Leynes
Thank you so much, I'm humbled by the response from such great authors and
scholars.  I thought I would share the final version that worked perfectly
in my illustrative example, as well as the real one.

My main confusion was this part:
> db1$olditems[db1$olditems=='']
[1]
Levels:  nuts soup
I thought only one item was returned, but really all three are.  They are
blank strings, so only the "[1]" index label is visible.

A side note: I don't understand the motivation to use "within" when simple
subsetting works using $ or [
Maybe it's important if the data frame has a really long name?
# To me, this is easier to read:
db1$olditems = factor(db1$olditems)
# Than this
db1 <- within(db1, olditems <- factor(olditems))

Here is my code example working, thanks to the generous feedback:

#A little function I always have loaded:
# (which, incidentally, was inspired by "Modern Applied Statistics with S"
# page 33)
factors=function(x)levels(x)[x]

# The data.frame option "stringsAsFactors=FALSE" would have been perfect to
# use here, but in my real example I can't re-import the data
db1 <- data.frame(
   olditems = c('soup','','','','nuts'),
   prices = c(4.45, 3.25, 4.42, 2.25, 3.98),
   stringsAsFactors = TRUE)   #the default before R 4.0
db2 <- data.frame(
   newitems = c('stew','crackers','tofu','goatsmilk','peanuts'),
   stringsAsFactors = TRUE)

db1$olditems[db1$olditems=='']         #it looks like only one item is returned
length(db1$olditems[db1$olditems=='']) #but all three are actually returned

db1$olditems=factors(db1$olditems) #converts the factors to strings
db1$olditems[db1$olditems=='']=NA  #replaces blanks with NA

#Note: this only works when db2 is in same order as db1
db1$olditems[is.na(db1$olditems)]=
    factors(db2$newitems[is.na(db1$olditems)])
db1$olditems=factor(db1$olditems)  #I like to use factors b/c they inherently
                                   # give a count of unique values
db1$olditems   #Success!
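
About the "same order" caveat: if both tables shared a key column, match()
could line the rows up instead of relying on position. This is only a sketch;
the id column is hypothetical and does not exist in my real data.

```r
# Hypothetical key column "id"; db2 is deliberately in a different row order
db1 <- data.frame(id = c(101, 102, 103, 104, 105),
                  olditems = c('soup', NA, NA, NA, 'nuts'),
                  stringsAsFactors = FALSE)
db2 <- data.frame(id = c(105, 104, 103, 102, 101),
                  newitems = c('peanuts', 'goatsmilk', 'tofu', 'crackers', 'stew'),
                  stringsAsFactors = FALSE)

k <- is.na(db1$olditems)                         # rows that need an update
rows <- match(db1$id[k], db2$id)                 # where each id sits in db2
db1$olditems[k] <- db2$newitems[rows]
db1
```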

On Tue, Jul 21, 2009 at 8:22 PM,  wrote:

> Couple of points:
>
> 1. if you are going to be replacing entries in factors with updated levels,
> it's probably easier if you start with your strings remaining as strings as
> they go into the data frames.  So here is how I would start your example
>
>
> db1 <- data.frame(
>    olditems = c('soup','','','','nuts'),
>    prices = c(4.45, 3.25, 4.42, 2.25, 3.98),
>    stringsAsFactors = FALSE)
> db2 <- data.frame(
>    newitems = c('stew','crackers','tofu','goatsmilk','peanuts'),
>    stringsAsFactors = FALSE)
>
>
> 2. Strings with zero characters are still strings (like zero is still a
> number).  They are not missing.  If you want them to be made missing you can
> do so afterwards with:
>
>
> ## zero length strings become NA
> is.na(db1$olditems[db1$olditems == '']) <- TRUE
>
>
> 3. Now to replace the missing values with the corresponding ones from the
> second data frame:
>
>
> k <- is.na(db1$olditems)
> db1[k, "olditems"] <- db2[k, "newitems"]
>
>
> 4. Check
>
> > db1
>   olditems prices
> 1  soup   4.45
> 2  crackers   3.25
> 3  tofu   4.42
> 4 goatsmilk   2.25
> 5  nuts   3.98
> >
>
> 5. If you really do want factors rather than character strings, you can now
> change back:
>
> db1 <- within(db1, olditems <- factor(olditems)) ## use <- here!
>
> 6. check the difference
>
> > str(db1)
> 'data.frame':   5 obs. of  2 variables:
>  $ olditems: Factor w/ 5 levels "crackers","goatsmilk",..: 4 1 5 2 3
>  $ prices  : num  4.45 3.25 4.42 2.25 3.98
> >
>
>
>
> Bill Venables
> http://www.cmis.csiro.au/bill.venables/
>
>
> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> On Behalf Of Gene Leynes
> Sent: Wednesday, 22 July 2009 10:39 AM
> To: r-help@r-project.org
> Subject: [R] How to replace NAs in a vector of factors?
>
> # Just when I thought I had the basic stuff mastered
> # This has been quite perplexing, thanks for any help
>
>
> ## Here's the example:
>
> db1=data.frame(
>    olditems=c('soup','','','','nuts'),
>    prices=c(4.45, 3.25, 4.42, 2.25, 3.98))
> db2=data.frame(
>    newitems=c('stew','crackers','tofu','goatsmilk','peanuts'))
>
> str(db1)#factors and prices
> str(db2)#new names, but I want *only* the updates
>
> is.na(db1$olditems)  #a little

Re: [R] Example scripts for R Manual

2009-08-10 Thread Gene Leynes
Have you tried running the examples?
Eg:
example(lm)
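
A short sketch of what example() can do, using only base R. The topic (mean) is an arbitrary choice, and the temporary file is only there so the sketch does not write into your working directory. Note this covers the examples on the help pages, which may not be the same as the ones in the PDF manuals.

```r
# Run the examples from a help page at the console
example(mean)

# Return the example code as a character vector without running it,
# e.g. to save it as a ready-to-run script file
ex_lines <- example(mean, give.lines = TRUE)
script <- tempfile(fileext = ".R")
writeLines(ex_lines, script)
```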

On Monday, August 10, 2009, Peng Yu  wrote:
> Some examples in the manual are not in the context. In order to use
> such examples, the users have to set up the variables in the examples.
> Adding accompany scripts to the manuals can make the manuals more
> reader friendly.
>
> Regards,
> Peng
>
> On Mon, Aug 10, 2009 at 10:20 PM, Ronggui Huang 
> wrote:
>> Is it really necessary? You can just copy the commands in the manual
>> and paste them to R.
>>
>> Ronggui
>>
>> 2009/8/11 Peng Yu :
>>> Hi,
>>>
>>> I am wondering if some experienced users would help put the
>>> ready-to-run code of the examples in the manuals. It would help new
>>> users  learn R faster by putting all the examples in an ready-to-run R
>>> script file. Can somebody help do so sometime and post the code along
>>> with the pdf manuals?
>>>
>>> http://cran.r-project.org/manuals.html
>>>
>>> Regards,
>>> Peng
>>>
>>> __
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>>
>>
>> --
>> HUANG Ronggui, Wincent
>> PhD Candidate
>> Dept of Public and Social Administration
>> City University of Hong Kong
>> Home page: http://asrr.r-forge.r-project.org/rghuang.html
>>
>
>



Re: [R] paste first row string onto every string in column

2009-08-13 Thread Gene Leynes
I'm also a newbie, but I've been getting loads of utility out of the "grep"
function and its cousin "gsub".

Asterisks are tricky because * is a special character in a search pattern.
In a file wildcard it means "anything of any length" (e.g. delete *.* means
delete all your files!), and in the regular expressions that grep uses it
means "the preceding character, repeated any number of times".  To find the
literal * using grep you need to put two \'s in front of it, so that R knows
you mean the character * and not the special meaning.
eg:
> txt=c('a','a*')
> txt
[1] "a"  "a*"
> grep('\\*',txt) # tell me, where is the star?
[1] 2
> gsub('\\*', ' success',txt)  #Replace the star with " success"
[1] "a"         "a success"
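
If the backslashes get confusing, there are two other ways to say "a literal star". This is a sketch using only base R arguments, on the same txt vector as above:

```r
txt <- c("a", "a*")

# fixed = TRUE treats the pattern as a plain string, so no escaping is needed
grep("*", txt, fixed = TRUE)                  # index of the element with a star
gsub("*", " success", txt, fixed = TRUE)      # same replacement as above

# Inside a character class the star is also taken literally
grepl("[*]", txt)
```

fixed = TRUE only helps when the whole pattern is a plain string; once you need ^, $ or + you are back to escaping the star.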

However, with the grep/gsub commands there are some other important symbols:
^    Pattern occurs at beginning
$    Pattern occurs at end
.    Means "anything", but only once
+    Preceding character occurs one or more times... so
.+   Means "anything, one or more times"
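
A quick way to get a feel for these symbols is to try them with grepl(), which returns TRUE or FALSE for each element. The test vector x is made up for illustration:

```r
x <- c("abc", "cba", "a", "aaa")

grepl("^a", x)     # starts with "a"
grepl("a$", x)     # ends with "a"
grepl("^.$", x)    # exactly one character of any kind
grepl("^a+$", x)   # one or more "a" and nothing else
grepl("^.+c$", x)  # at least one character, then a final "c"
```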

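So, to strip off everything up to and including the *, I would try something
like this (.+ is greedy, so if a string has several *'s the match runs to the
last one; use '' as the replacement to delete the match instead):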
> txt=c('aa*very important','a*important')
> txt
[1] "aa*very important" "a*important"
> gsub('^.+\\*', 'success',txt)
[1] "successvery important" "successimportant"
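
Here is a sketch of the actual stripping, with an empty replacement. The third element of txt and the data frame called coded are made up for illustration (coded is only a stand-in for the real data, with strings rather than factors). It also shows a pattern that stops at the first star rather than the last one:

```r
txt <- c("aa*very important", "a*important", "a*b*c")

# Greedy: .+ grabs as much as it can, so the match runs to the LAST star
sub("^.+\\*", "", txt)

# [^*]* matches only non-star characters, so the match stops at the FIRST star
sub("^[^*]*\\*", "", txt)

# Applied to every column of a data frame, whatever the number of columns
coded <- data.frame(V1 = c("DPA1*0103", "DPA1*0103"),
                    V3 = c("DPB1*0401", "DPB1*0301"),
                    stringsAsFactors = FALSE)
coded[] <- lapply(coded, function(x) sub("^[^*]*\\*", "", x))
coded
```

With only one star per string both patterns give the same answer, so the difference only matters if a star can show up again later in the text.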

On Wed, Aug 12, 2009 at 11:06 AM, Jill Hollenbach wrote:

>
> Thanks so much everybody, this has been incredibly helpful--not only is my
> immediate issue solved but I've learned a lot in the process. The lapply
> solution is best for me, as I need flexibility to edit df's with varying
> numbers of columns.
>
> Now, one more question: after appending the string from the first line, I
> am manipulating the df further (recoding the original contents; this I have
> working fine), and afterwards I will need to strip back off that string. It
> seems relatively straightforward, except that, as shown in the example
> above (df2), there is an asterisk involved (I need to remove all characters
> up to and including the asterisk) which seems problematic.
> Any suggestions?
> Many thanks,
> Jill
>
>
>
> Don MacQueen wrote:
> >
> > Let's start with something simple and relatively easy to understand,
> > since you're new to this.
> >
> > First, here's an example of the core of the idea:
> >>  paste('a',1:4)
> > [1] "a 1" "a 2" "a 3" "a 4"
> >
> > Make it a little closer to your situation:
> >>  paste('a*',1:4, sep='')
> > [1] "a*1" "a*2" "a*3" "a*4"
> >
> > Sometimes it helps to save the number of rows in your dataframe in a
> > new variable
> >
> > nr <- nrow(df)
> >
> > Then, for your first column, the "a*" in the above example is df$V1[1]
> > For the 1:4 in the example, you use  df$V1[ 2:nr]
> > Put it together and you have:
> >
> > dfnew <- df
> > dfnew$V1[ 2:nr] <- paste( dfnew$V1[1], dfnew$V1[ 2:nr], sep='' )
> >
> > But you can use "-1" instead of "2:nr", and you get
> >
> >    dfnew$V1[ -1 ] <- paste( dfnew$V1[1], dfnew$V1[ -1], sep='' )
> >
> > That's how you can do it one column at a time.
> > Since you have only four columns, just do the same thing to V2, V3, and
> > V4.
> >
> > But if you want a more general method, one that works no matter how
> > many columns you have, and no matter what they are named, then you
> > can use lapply() to loop over the columns. This is what Patrick
> > Connolly suggested, which is
> >
> > as.data.frame(lapply(df, function(x) paste(x[1], x[-1], sep = "")))
> >
> > Note, though, that this will do it to all columns, so if you ever
> > happen to have a dataframe where you don't want to do all columns,
> > you'll have to be a little trickier with the lapply() solution.
> >
> > -Don
> >
> > At 6:48 PM -0700 8/11/09, Jill Hollenbach wrote:
> >>Hi,
> >>I am trying to edit a data frame such that the string in the first line is
> >>appended onto the beginning of each element in the subsequent rows. The
> >>data looks like this:
> >>
> >>> df
> >>     V1    V2    V3    V4
> >>1 DPA1* DPA1* DPB1* DPB1*
> >>2  0103  0104  0401  0601
> >>3  0103  0103  0301  0402
> >>.
> >>.
> >>  and what I want is this:
> >>
> >>> dfnew
> >>         V1        V2        V3        V4
> >>1     DPA1*     DPA1*     DPB1*     DPB1*
> >>2 DPA1*0103 DPA1*0104 DPB1*0401 DPB1*0601
> >>3 DPA1*0103 DPA1*0103 DPB1*0301 DPB1*0402
> >>
> >>any help is much appreciated, I am new to this and struggling.
> >>Jill
> >>
> >>___
> >>  Jill Hollenbach, PhD, MPH
> >> Assistant Staff Scientist
> >> Center for Genetics
> >> Children's Hospital Oakland Research Institute
> >> jhollenb...@chori.org
> >>
> >>--
> >>View this message in context:
> >>http://www.nabble.com/paste-first-row-string-onto-every-string-in-column-tp24928720p24928720.html
> >>Sent from the R help mailing list archive at Nabble.com.
> >>
> >
> >
> > --
> > --
> > Don MacQueen
> > Environmental Protection Department
> > Lawrence Livermore Natio

[R] where did ggplot go?

2009-09-04 Thread Gene Leynes
This must be explained somewhere, but I've been searching for a couple of
hours and not found it.

What happened to ggplot?  It appears to be missing on CRAN, except in the
archives.
http://cran.r-project.org/web/packages/ggplot/index.html

Has ggplot2 replaced ggplot?
I was trying to run some examples and found that "pscontinuous" and "ggline"
are not part of ggplot2.

Thanks,

Gene
