[R] Sweave trim output

2011-02-25 Thread Dimitris Rizopoulos

Dear All,

I'd like to trim the output produced in a Sweave code chunk. For 
instance, in


fit <- lm(conc ~ . - Plant, data = CO2)
summary(fit)

I'd like to skip the info after the coefficients' table, and possibly 
replace it with '...'.


I've created this small function to do this, which is based on 
capture.output():


trim.output <- function (x, lines, above = FALSE) {
    if (above)
        cat("\n...\n\n")
    cat(paste(x[lines], collapse = "\n"))
    cat("\n\n...\n")
}

out <- capture.output(summary(fit))
trim.output(out, 1:13)


but I was wondering if there is an *official* way to do this.
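
One possible variant, as a sketch only: instead of hard-coding the line range, search the captured output for the first line that follows the coefficient table. The helper name trim_after_coef and the patterns it matches are assumptions made for illustration; they are not part of Sweave or of any official interface.

```r
# Hypothetical helper: keep summary() output up to the end of the
# coefficient table and replace the remainder with "..."
trim_after_coef <- function(fit) {
    out <- capture.output(summary(fit))
    # first line after the table: the "---" rule, the significance
    # legend, or the residual standard error line
    stop_at <- grep("^(---|Signif\\. codes|Residual standard error)", out)[1]
    kept <- if (is.na(stop_at)) out else out[seq_len(stop_at - 1)]
    c(kept, "...")
}

fit <- lm(conc ~ . - Plant, data = CO2)
trimmed <- trim_after_coef(fit)
cat(trimmed, sep = "\n")
```

The pattern depends on how summary() prints a given model class, so it would need adjusting for models other than lm.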


Thanks in advance.

Best,
Dimitris

--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
Web: http://www.erasmusmc.nl/biostatistiek/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Variable names AS variable names?

2011-02-25 Thread David Winsemius


On Feb 25, 2011, at 1:55 AM, Noah Silverman wrote:


How can I dynamically use a variable as the name for another variable?

I realize this sounds cryptic, so an example is best:

#Start with an array of codes
codes <- c("a1", "b24", "q99")


Is there some reason not to use list(a1, b24, q99)? If not then:

lapply(codes, somefun)




#Each code has a corresponding matrix (could be vector)
a1 <- matrix(rnorm(100), nrow=10)
b24 <- matrix(rnorm(100), nrow=10)
q99 <- matrix(rnorm(100), nrow=10)

#Now, I want to loop through all the codes and do something with
#each matrix

for(code in codes){
   #here is where I'm stuck.  I don't want the value of code, but the
   #variable whose name is the value of code

}


Any suggestions?

-N



David Winsemius, MD
West Hartford, CT



Re: [R] Creating objects (data.frames) with names stored in character vector

2011-02-25 Thread David Winsemius


On Feb 24, 2011, at 3:38 PM, Kent Alleman wrote:


Hello,

I'm fairly new to R.  I'm a chemist, not a programmer so please bear  
with me.


I have a large data.frame that I want to break down (subset) into  
smaller data.frames for analysis.  I would like to give the  
data.frames descriptive names which I have stored in a character  
vector.  My original thought was that I want the subsets to show up  
as individual objects, but having them stored in a list is fine  
(maybe better).


I can create a list of subsetted data.frames like this:

Lst = list(subset1 = (subset (blablabla)), subset2 =  
(subset(blabla)))

but I have to provide the component names (subset1, subset2) manually.


lstnames <- paste("subset", 1:2, sep="_")
names(Lst) <- lstnames
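
As a sketch of the same idea end to end, assuming the subsets correspond to the levels of a grouping column: split() builds the list of data frames in one step and names() then takes the labels from a character vector. The built-in iris data and the labels below are stand-ins chosen for illustration, not the original poster's data.

```r
# Hypothetical descriptive names, stored in a character vector
lstnames <- c("setosa_flowers", "versicolor_flowers", "virginica_flowers")

# split() returns one data frame per level of the grouping factor,
# in the order of the factor levels
Lst <- split(iris, iris$Species)
names(Lst) <- lstnames

# each subset is now reachable by its descriptive name
str(Lst, max.level = 1)
nrow(Lst[["setosa_flowers"]])
```

Keeping the subsets in a named list means later steps can be applied to all of them with lapply().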



I would like to pull the component names from an existing character  
vector, but so far my attempts have failed.


Any advice is appreciated, even if the advice is "don't do that".

Thank you,

Kent



David Winsemius, MD
West Hartford, CT



[R] Selecting data based on the range of dates

2011-02-25 Thread Belle

Hi:

I want to create an index that is 1 for all dates between September and
November and 0 for anything else. The year doesn't matter: as long as the
date falls between September and November the index is 1, otherwise it is 0.

My data frame looks like below:

ID  Date
201 1/1/05 6:07 AM
201 3/27/09 9:45 AM
201 9/29/09 8:44 AM
203 10/16/08 10:01 AM
203 10/28/08 9:45 AM
203 10/31/08 11:12 AM
203 11/7/08 11:32 AM
203 11/14/08 10:30 AM
203 11/19/08 10:40 AM
203 11/25/08 3:25 PM
203 12/4/08 10:48 AM
203 1/28/09 11:04 AM
203 2/12/09 3:15 PM
203 2/16/09 2:59 PM
203 2/24/09 2:45 PM
203 3/4/09 10:14 AM
203 3/27/09 11:36 AM
203 4/1/09 10:43 AM
203 4/16/09 2:28 PM
203 4/22/09 2:37 PM
203 4/29/09 10:48 AM
203 4/1/09 10:45 AM
203 12/3/09 9:07 AM
203 12/11/09 8:58 AM
203 1/7/10 8:53 AM

Thanks
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Selecting-data-based-on-the-range-of-dates-tp3323452p3323452.html
Sent from the R help mailing list archive at Nabble.com.



[R] Multivariate integration

2011-02-25 Thread shree sonal
Hello,

I came across a package called cubature (
http://cran.r-project.org/web/packages/cubature/index.html) to perform
multivariate integration. There are a few things I was not able to understand:

What is the need for package flags under src/Makevars?
What is the purpose of fWrapper in the rcubature.c file?

Thank you
Sonal



Re: [R] Selecting data based on the range of dates

2011-02-25 Thread Belle

I think I got it. I'm posting it here; if you have a better way, please let
me know.

index <- rep(0, length(mydata[,1]))
index[as.Date(mydata3$Date) < as.Date("2006-11-30 23:29:29 PM") &
as.Date(mydata3$Date) > as.Date("2006-09-01 00:00:00 AM")] <- 1
index[as.Date(mydata3$Date) < as.Date("2007-11-30 23:29:29 PM") &
as.Date(mydata3$Date) > as.Date("2007-09-01 00:00:00 AM")] <- 1
index[as.Date(mydata3$Date) < as.Date("2008-11-30 23:29:29 PM") &
as.Date(mydata3$Date) > as.Date("2008-09-01 00:00:00 AM")] <- 1
index[as.Date(mydata3$Date) < as.Date("2009-11-30 23:29:29 PM") &
as.Date(mydata3$Date) > as.Date("2009-09-01 00:00:00 AM")] <- 1
index[as.Date(mydata3$Date) < as.Date("2010-11-30 23:29:29 PM") &
as.Date(mydata3$Date) > as.Date("2010-09-01 00:00:00 AM")] <- 1

Thanks
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Selecting-data-based-on-the-range-of-dates-tp3323452p3323536.html
Sent from the R help mailing list archive at Nabble.com.



[R] I have a Quick question about biometics

2011-02-25 Thread William Hammack
Hello,
 
I was searching online to find more info about Biometics
and I came across your information.
 
Can you tell me, are you still involved with Biometics? 
If you are, how are things going for you?
 
Please let me know.
 
Sincerely,
Will Hammack



[R] Compatibility with R for Windows 2.12.2

2011-02-25 Thread Vedajit Boyd
Hi,

Could someone please let me know whether installing both R for Windows
2.12.2 and MS Office 2010 on the same system will cause them to interfere
with each other?
In short, are these two tools compatible with each other?

Thanks in advance.

Best Regards,
Vedajit



Re: [R] Variable names AS variable names?

2011-02-25 Thread Patrick Burns

One of my hypotheses of what you want is:

for(code in codes) {
get(code)
}

The other one is:

for(code in codes) {
as.name(code)
}
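
To make the first hypothesis concrete, here is a minimal sketch with small stand-in matrices (the values and the name totals are illustrative only). get() looks up the object whose name is held in a string, and mget() does the same for a whole character vector at once.

```r
codes <- c("a1", "b24", "q99")
a1  <- matrix(1:4,  nrow = 2)
b24 <- matrix(5:8,  nrow = 2)
q99 <- matrix(9:12, nrow = 2)

totals <- numeric(0)
for (code in codes) {
    m <- get(code)            # the object whose name is stored in `code`
    totals[code] <- sum(m)    # do something with each matrix
}
totals

# mget() fetches all of them in one call, as a list named by `codes`
mats <- mget(codes, envir = environment())
sapply(mats, dim)
```

Once the matrices are in a named list such as mats, the loop can usually be replaced by lapply() or sapply() over that list.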


On 25/02/2011 06:55, Noah Silverman wrote:

How can I dynamically use a variable as the name for another variable?

I realize this sounds cryptic, so an example is best:

#Start with an array of codes
codes <- c("a1", "b24", "q99")

#Each code has a corresponding matrix (could be vector)
a1 <- matrix(rnorm(100), nrow=10)
b24 <- matrix(rnorm(100), nrow=10)
q99 <- matrix(rnorm(100), nrow=10)

#Now, I want to loop through all the codes and do something with each matrix
for(code in codes){
 #here is where I'm stuck.  I don't want the value of code, but the
 #variable whose name is the value of code

}


Any suggestions?

-N




--
Patrick Burns
pbu...@pburns.seanet.com
twitter: @portfolioprobe
http://www.portfolioprobe.com/blog
http://www.burns-stat.com
(home of 'Some hints for the R beginner'
and 'The R Inferno')



Re: [R] Compatibility with R for Windows 2.12.2

2011-02-25 Thread Prof Brian Ripley

On Fri, 25 Feb 2011, Vedajit Boyd wrote:


Hi,

Please someone let me know that the installation of both R for Windows
2.12.2 and MS office 2010 on the same system will interfere each other or
not.
In short, are these two tools compatible to each other?


There is nothing special about R, but you will have to ask Microsoft 
if their products cause problems for other (well written, 
standard-conformant) software.  Their software works at system level: 
R does not install anything at system level, not even registry entries 
(unless selected in an Administrator install).


R is widely used on systems with MS office 2007 installed, but that's 
no guarantee that some rarely used Office option on some version of 
Windows does not interfere with R.


NB: 'R for Windows 2.12.2' is future-ware.



Thanks in advance.

Best Regards,
Vedajit




--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



Re: [R] GLM, how to get an R2 to explain how much of data explained by one variable

2011-02-25 Thread Clare Embling
Hi Celine,

GLM outputs usually give the null deviance and residual deviance in the 
summary() term - so you can work out % deviance explained for a variable/model 
from this.  Hope this helps.
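
A minimal sketch of that calculation, assuming a Poisson GLM on the built-in warpbreaks data (a stand-in, not the original poster's model). The per-variable figure below compares the fit with and without one term, which is only one of several possible readings of "deviance explained by a variable".

```r
# Illustrative model on a built-in dataset
fit <- glm(breaks ~ wool + tension, family = poisson, data = warpbreaks)

# proportion of the null deviance accounted for by the whole model
dev_expl <- 1 - fit$deviance / fit$null.deviance
round(100 * dev_expl, 1)

# share attributable to one variable: refit without it and compare
fit0 <- update(fit, . ~ . - tension)
dev_tension <- (fit0$deviance - fit$deviance) / fit$null.deviance
round(100 * dev_tension, 1)
```

When predictors are correlated, the share assigned to a variable depends on which other terms are already in the model.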

Best wishes,
Clare


Dr Clare B Embling
Visiting Research Fellow
Marine Institute
University of Plymouth
Plymouth, UK.



Re: [R] Compatibility with R for Windows 2.12.2

2011-02-25 Thread Rob Tirrell
I can't think of a reason why they would...

Rob Tirrell


On Thu, Feb 24, 2011 at 23:56, Vedajit Boyd vedajit.b...@gmail.com wrote:

 Hi,

 Please someone let me know that the installation of both R for Windows
 2.12.2 and MS office 2010 on the same system will interfere each other or
 not.
 In short, are these two tools compatible to each other?

 Thanks in advance.

 Best Regards,
 Vedajit





[R] Substituting inside expression

2011-02-25 Thread zbynek.jano...@gmail.com

I am having the following problem:
I'm constructing a model for calculating the area of a triangle.
I know sides a, b, and the angle gamma.
I wish to calculate the area using Heron's formula:
S <- sqrt(s*(s-a)*(s-b)*(s-c))
where
s <- (a+b+c)/2
and c is calculated using the law of cosines:
c <- sqrt(a^2 + b^2 -2*a*b*cos(gamma))

Since I am fitting a regression model, I need the derivative of this
expression for the area S,
something like (D(expression.S, c("a","b")))

Writing it all as a single expression is too complicated, so I would
like to use some kind of substitution. However, if I try:

s.e <- substitute(expression((a+b+c)/2), list(c =
expression(sqrt(a^2+b^2-2*a*b*cos(gamma)))))
I get
s.e
expression((a + b + expression(sqrt(a^2 + b^2 - 2 * a * b * cos(gamma))))/2)

which is not what I wanted.

Can someone point me in the right direction?

-- 
View this message in context: 
http://r.789695.n4.nabble.com/Substituting-inside-expression-tp3324092p3324092.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Selecting data based on the range of dates

2011-02-25 Thread Rob Tirrell
Try

my.date <- strptime("20/2/06 11:16:16.683", "%d/%m/%y %H:%M:%OS")

Then you can examine my.date$mon.
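
Applied to the date format shown in the original post, the idea might look like the sketch below. The sample strings are taken from the posted data; the format string is an assumption about how that column is stored. Only the date part is parsed, since strptime() ignores trailing characters once the format has been consumed and the time of day does not affect the month.

```r
dates <- c("1/1/05 6:07 AM", "9/29/09 8:44 AM",
           "10/16/08 10:01 AM", "12/4/08 10:48 AM")

# parse month/day/two-digit year; the trailing time is ignored
parsed <- strptime(dates, format = "%m/%d/%y", tz = "UTC")

# $mon counts months from 0 (January), so September to November are 8 to 10
index <- as.integer(parsed$mon %in% 8:10)
index
```

Because the test is on the month alone, no per-year comparison is needed.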
--
Robert Tirrell | r...@stanford.edu | (607) 437-6532
Program in Biomedical Informatics | Butte Lab | Stanford University



On Thu, Feb 24, 2011 at 14:12, Belle ping...@gmail.com wrote:


 I think I got it, I post it here see if you have better way, please let me
 know.

 index <- rep(0, length(mydata[,1]))
 index[as.Date(mydata3$Date) < as.Date("2006-11-30 23:29:29 PM") &
 as.Date(mydata3$Date) > as.Date("2006-09-01 00:00:00 AM")] <- 1
 index[as.Date(mydata3$Date) < as.Date("2007-11-30 23:29:29 PM") &
 as.Date(mydata3$Date) > as.Date("2007-09-01 00:00:00 AM")] <- 1
 index[as.Date(mydata3$Date) < as.Date("2008-11-30 23:29:29 PM") &
 as.Date(mydata3$Date) > as.Date("2008-09-01 00:00:00 AM")] <- 1
 index[as.Date(mydata3$Date) < as.Date("2009-11-30 23:29:29 PM") &
 as.Date(mydata3$Date) > as.Date("2009-09-01 00:00:00 AM")] <- 1
 index[as.Date(mydata3$Date) < as.Date("2010-11-30 23:29:29 PM") &
 as.Date(mydata3$Date) > as.Date("2010-09-01 00:00:00 AM")] <- 1

 Thanks
 --
 View this message in context:
 http://r.789695.n4.nabble.com/Selecting-data-based-on-the-range-of-dates-tp3323452p3323536.html
 Sent from the R help mailing list archive at Nabble.com.





[R] speed up process

2011-02-25 Thread Ivan Calandra

Dear users,

I have a double for loop that does exactly what I want, but is quite 
slow. It is not so much with this simplified example, but IRL it is slow.

Can anyone help me improve it?

The data and code for foo_reg() are available at the end of the email; I 
preferred going directly into the problematic part.
Here is the code (I tried to simplify it but I cannot do it too much or 
else it wouldn't represent my problem). It might also look too complex 
for what it is intended to do, but my colleagues who are also supposed 
to use it don't know much about R. So I wrote it so that they don't have 
to modify the critical parts to run the script for their needs.


#column indexes for function
ind.xvar <- 2
seq.yvar <- 3:4
#position vector for legend(), stupid positioning but it doesn't matter here
mypos <- c("topleft", "topright", "bottomleft")

#run the function for columns 3 & 4 as y (seq.yvar) with column 2 as x
#(ind.xvar) for all 3 datasets (mydata_list)

par(mfrow=c(2,1))
for (i in seq_along(seq.yvar)){
  k <- seq.yvar[i]
  plot(mydata1[[k]]~mydata1[[ind.xvar]], type="p",
       xlab=names(mydata1)[ind.xvar], ylab=names(mydata1)[k])

  for (j in seq_along(mydata_list)){
    foo_reg(dat=mydata_list[[j]], xvar=ind.xvar, yvar=k, mycol=j,
            pos=mypos[j], name.dat=names(mydata_list)[j])

  }
}

I tried with lapply() or mapply() but couldn't manage to pass the 
arguments for names() and col= correctly, e.g. for the 2nd loop:
lapply(mydata_list, FUN=function(x){foo_reg(dat=x, xvar=ind.xvar, 
yvar=k, col1=1:3, pos=mypos[1:3], name.dat=names(x)[1:3])})
mapply(FUN=function(x) {foo_reg(dat=x, name.dat=names(x)[1:3])}, 
mydata_list, col1=1:3, pos=mypos, MoreArgs=list(xvar=ind.xvar, yvar=k))


Thanks in advance for any hints.
Ivan




#create data (it looks horrible with these datasets but it doesn't
#matter here)
mydata1 <- structure(list(species = structure(1:8, .Label = c("alsen",
"gogor", "loalb", "mafas", "pacyn", "patro", "poabe", "thgel"), class =
"factor"), fruit = c(0.52, 0.45, 0.43, 0.82, 0.35, 0.9, 0.68, 0), Asfc =
c(207.463765, 138.5533755, 70.4391735, 160.9742745, 41.455809,
119.155109, 26.241441, 148.337377), Tfv = c(47068.1437773483,
43743.8087431582, 40323.5209129239, 23420.9455581495, 29382.6947428651,
50460.2202192311, 21810.1456510625, 41747.6053810881)), .Names =
c("species", "fruit", "Asfc", "Tfv"), row.names = c(NA, 8L), class =
"data.frame")


mydata2 <- mydata1[!(mydata1$species %in% c("thgel","alsen")),]
mydata3 <- mydata1[!(mydata1$species %in% c("thgel","alsen","poabe")),]
mydata_list <- list(mydata1=mydata1, mydata2=mydata2, mydata3=mydata3)

#function for regression
library(WRS)
foo_reg <- function(dat, xvar, yvar, mycol, pos, name.dat){
 tsts <- tstsreg(dat[[xvar]], dat[[yvar]])
 tsts_inter <- signif(tsts$coef[1], digits=3)
 tsts_slope <- signif(tsts$coef[2], digits=3)
 abline(tsts$coef, lty=1, col=mycol)
 legend(x=pos, legend=c(paste("TSTS ",name.dat,": Y=",tsts_inter,"+",
        tsts_slope,"X",sep="")), lty=1, col=mycol)

}

--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de

**
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php



Re: [R] Substituting inside expression

2011-02-25 Thread Ivan Calandra

Hi,

If I follow you correctly, you could write a function:

foo <- function(a,b,gamma){
 c <- sqrt(a^2 + b^2 -2*a*b*cos(gamma))
 s <- (a+b+c)/2
 A <- sqrt(s*(s-a)*(s-b)*(s-c))
 return(A)
}

I hope I didn't make mistakes, but it can still help you, I guess.
Ivan


On 2/25/2011 10:11, zbynek.jano...@gmail.com wrote:

I am having following problem:
I'm constructing model for calculation of area of triangle.
I know sides a, b, and gamma angle.
I wish to calculate the area using Heron's formula:
S <- sqrt(s*(s-a)*(s-b)*(s-c))
where
s <- (a+b+c)/2
and c is calculated using law of cosines:
c <- sqrt(a^2 + b^2 -2*a*b*cos(gamma))

since i am calculating a regression model, i need derivation of this
expression for area S.
something like (D(expression.S, c("a","b")))

To write it all into a single expression, it is too complicated, so i would
like to use some kind of substitution. however, if i try:

s.e <- substitute(expression((a+b+c)/2), list(c =
expression(sqrt(a^2+b^2-2*a*b*cos(gamma)))))
I get

s.e

expression((a + b + expression(sqrt(a^2 + b^2 - 2 * a * b * cos(gamma))))/2)

which is not what I wanted

Can someone point me to the right direction?



--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de

**
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php



Re: [R] speed up process

2011-02-25 Thread Nick Sabbe
You can simply avoid the for loops by using lapply (I may have missed a
bracket here or there because I did this without opening R).
I haven't checked the speed-up, though.

lapply(seq.yvar, function(k){
   plot(mydata1[[k]]~mydata1[[ind.xvar]], type="p",
        xlab=names(mydata1)[ind.xvar], ylab=names(mydata1)[k])
   lapply(seq_along(mydata_list), function(j){
     foo_reg(dat=mydata_list[[j]], xvar=ind.xvar, yvar=k, mycol=j,
             pos=mypos[j], name.dat=names(mydata_list)[j])
     return(NULL)
   })
   invisible(NULL)
})

HTH,

Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove






Re: [R] Substituting inside expression

2011-02-25 Thread Gabor Grothendieck
On Fri, Feb 25, 2011 at 4:11 AM, zbynek.jano...@gmail.com
zbynek.jano...@centrum.cz wrote:

 I am having following problem:
 I'm constructing model for calculation of area of triangle.
 I know sides a, b, and gamma angle.
 I wish to calculate the area using Heron's formula:
 S <- sqrt(s*(s-a)*(s-b)*(s-c))
 where
 s <- (a+b+c)/2
 and c is calculated using law of cosines:
 c <- sqrt(a^2 + b^2 -2*a*b*cos(gamma))

 since i am calculating a regression model, i need derivation of this
 expression for area S.
 something like (D(expression.S, c("a","b")))

 To write it all into a single expression, it is too complicated, so i would
 like to use some kind of substitution. however, if i try:

 s.e <- substitute(expression((a+b+c)/2), list(c =
 expression(sqrt(a^2+b^2-2*a*b*cos(gamma)))))
 I get
 s.e
 expression((a + b + expression(sqrt(a^2 + b^2 - 2 * a * b * cos(gamma))))/2)

 which is not what I wanted

 Can someone point me to the right direction?

Try this:

 e <- substitute((a+b+c)/2, list(c = quote(sqrt(a^2+b^2-2*a*b*cos(gamma)))))
 D(e, "a")
(1 + 0.5 * ((2 * a - 2 * b * cos(gamma)) * (a^2 + b^2 - 2 * a * b *
cos(gamma))^-0.5))/2

Also

 library(Ryacas)  # http://ryacas.googlecode.com

 a <- Sym("a"); b <- Sym("b"); gamma <- Sym("gamma")

 c <- sqrt(a^2+b^2-2*a*b*cos(gamma))
 deriv((a+b+c)/2, a)
expression(2 * ((2 * a - 2 * b * cos(gamma))/(2 * root(a^2 +
b^2 - 2 * a * b * cos(gamma), 2)) + 1)/4)
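
The same substitute() and quote() pattern can be carried one step further to the full area expression. The following is a sketch only; the names c.e, s.e, S.e, dS.da, dS.db and the test values are chosen for illustration. The numerical check relies on the standard identity that the area of a triangle with sides a, b and included angle gamma is a*b*sin(gamma)/2.

```r
# c from the law of cosines, then s, then Heron's formula
c.e <- quote(sqrt(a^2 + b^2 - 2*a*b*cos(gamma)))
s.e <- substitute((a + b + c)/2, list(c = c.e))
S.e <- substitute(sqrt(s*(s - a)*(s - b)*(s - c)), list(s = s.e, c = c.e))

# symbolic partial derivatives with respect to a and b
dS.da <- D(S.e, "a")
dS.db <- D(S.e, "b")

# numerical sanity check at one point
vals <- list(a = 3, b = 4, gamma = pi/3)
area  <- eval(S.e, vals)
slope <- eval(dS.da, vals)
c(area = area, closed_form = 3 * 4 * sin(pi/3) / 2)
c(dS.da = slope, closed_form = 4 * sin(pi/3) / 2)
```

The derivative expressions returned by D() are long and unsimplified, so evaluating them numerically is usually more practical than reading them.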



-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



[R] scatter graph of more than one value

2011-02-25 Thread amir
Hi,

I have two pairs of variables, X1, X2 and Y1, Y2, and I want to draw them
((X1,Y1), (X2,Y2)) in a scatter graph.

How can I draw both of them in the same graph with different legends?
And is there any way to show different labels on each point?

Regards,
Amir



[R] color code in loop for piecharts plotting

2011-02-25 Thread Lucia Rueda

Hi,
I am using this loop

par(mfrow=c(3,3))
annos <- c(2001:2007,2009)
for (i in annos) {
  t <- subset(masia,YEAR==i)
  t$FAMILIA <- drop.levels(t$FAMILIA)
  pie(table(t$FAMILIA),main=i)
}

To make pie charts of species composition across years (my data frame is
called masia). So I get one pie chart per year of the families that we found
in our survey. We don't always have the same families every year, so I
added  t$FAMILIA <- drop.levels(t$FAMILIA)
to the loop to avoid showing, in a given year's pie, family levels that are
absent in that year.

The problem is that the color code changes, so I get, for example, different
colors for the same families in different years.

If I group the families with few individuals together in a
category called "others" and make a new column called "familia2" with
fewer levels, so that every level of familia2 is present in every year,
the problem goes away and all families have the same color across years.

Does anybody know how to avoid the color code changing for the families in
the loop? I know I can do it manually and give each family a color, but I
have quite a lot of families, so I'm wondering if there's some other way to
fix that.
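
One possible approach, sketched with a toy stand-in for masia (the family names and counts are invented): build a named colour vector once from the full set of family levels, then look colours up by name inside the loop. base R's droplevels() is used here in place of drop.levels(), and the loop variable is called yr instead of t.

```r
# Toy stand-in for the `masia` data frame (hypothetical values)
masia <- data.frame(
    YEAR    = c(2001, 2001, 2001, 2002, 2002),
    FAMILIA = factor(c("Labridae", "Sparidae", "Gobiidae",
                       "Labridae", "Gobiidae"))
)

# one colour per family, fixed once from all levels in the data
fam_cols <- setNames(rainbow(nlevels(masia$FAMILIA)), levels(masia$FAMILIA))

par(mfrow = c(1, 2))
for (i in unique(masia$YEAR)) {
    yr  <- droplevels(subset(masia, YEAR == i))
    tab <- table(yr$FAMILIA)
    # index colours by family name so they stay the same across years
    pie(tab, main = i, col = fam_cols[names(tab)])
}
```

Any palette function could replace rainbow(); the point is only that the mapping from family to colour is created before the loop.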

I don't know if I made myself clear...

Thanks!

Lucia
-- 
View this message in context: 
http://r.789695.n4.nabble.com/color-code-in-loop-for-piecharts-plotting-tp3324196p3324196.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] scatter graph of more than one value

2011-02-25 Thread Ivan Calandra

Hi,

Take a look at ?points, ?legend and ?par (specifically col and pch)
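
A small sketch of how those functions fit together, using made-up data (the values, colours and point labels are illustrative only):

```r
# Made-up data for illustration
X1 <- c(1, 2, 3);       Y1 <- c(2, 4, 3)
X2 <- c(1.5, 2.5, 3.5); Y2 <- c(5, 1, 4)

# set the axis limits from both series so that no point is clipped
plot(X1, Y1, col = "blue", pch = 16,
     xlim = range(X1, X2), ylim = range(Y1, Y2), xlab = "X", ylab = "Y")
points(X2, Y2, col = "red", pch = 17)   # add the second series
legend("topleft", legend = c("series 1", "series 2"),
       col = c("blue", "red"), pch = c(16, 17))

# label every point; pos = 3 places the text above the point
point_labels <- c(paste("a", 1:3, sep = ""), paste("b", 1:3, sep = ""))
text(c(X1, X2), c(Y1, Y2), labels = point_labels, pos = 3)
```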

HTH,
Ivan

On 2/25/2011 11:58, amir wrote:

Hi,

I have two X1,X2 and Y1,Y2 and I want to draw them ((X1,Y1), (X2,Y2)) in
a scatter graph.

How can I draw both of them in a same graph with different legends?
And is there any way to show different labels on each point?

Regards,
Amir




--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de

**
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php



[R] time series with NA - acf - tsdiag - Ljung-Box

2011-02-25 Thread Cecilia Reyna
Hi all,
I am modelling a time series with missing data.

Q1) However, I am not sure whether I should use the following graphics to
understand my data:
a) ACF & PACF (original series)
b) ACF & PACF (residuals)

Q2) I am using tsdiag, so I obtain a graphic with 3 plots: standardized
residuals vs time; ACF of the residuals; Ljung-Box for the residuals (it is
wrong for residuals).
I know that using Box.test with type "Ljung-Box", I can specify the correct df
for my estimated model (fitdf = p + q). So, I could do this test with
different lags, evaluate their significance, and then plot it. However,
Box.test does not handle NA.
But it is possible to do a Ljung-Box test with missing data [Stoffer &
Toloi, 1992. A note on the Ljung-Box-Pierce portmanteau statistic with
missing data].
a) Do you know any function to do a Ljung-Box test with NA?
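
The Box.test approach described in Q2 can be sketched as follows for a series without missing values. This is only an illustration under assumptions: the series is simulated, the AR(1) order is made up, and it does not implement the Stoffer & Toloi statistic for missing data.

```r
set.seed(1)
# Simulated complete series and an AR(1) fit (both are assumptions)
y   <- arima.sim(list(ar = 0.5), n = 200)
fit <- arima(y, order = c(1, 0, 0))
p <- 1; q <- 0

# Ljung-Box p-values over several lags, with df corrected by fitdf = p + q;
# lags start at p + q + 1 so that the degrees of freedom stay positive
lags  <- (p + q + 1):20
pvals <- sapply(lags, function(l) {
  Box.test(residuals(fit), lag = l, type = "Ljung-Box", fitdf = p + q)$p.value
})

plot(lags, pvals, ylim = c(0, 1), xlab = "lag", ylab = "p-value")
abline(h = 0.05, lty = 2)   # conventional 5% reference line
```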

Q3) In general, what (other?) tools do you recommend for time
series with missing data?

I had been using auto.arima and arima functions.
I don't want to do an interpolation.


Thanks in advance,
Cecilia



Re: [R] speed up process

2011-02-25 Thread Ivan Calandra

Thanks Nick for your quick answer.
It does work (no missed bracket!) but unfortunately doesn't really speed 
up anything: with my real data, it takes 82.78 seconds with the double 
lapply() instead of 83.59s with the double loop (about 0.8 s).


It looks like my double loop was not that bad. Does anyone know another 
faster way to do this?


Thanks again in advance,
Ivan

On 2/25/2011 11:41, Nick Sabbe wrote:

Simply avoiding the for loops by using lapply (I may have missed a bracket
here or there cause I did this without opening R)...
Haven't checked the speed up, though.

lapply(seq.yvar, function(k){
  plot(mydata1[[k]]~mydata1[[ind.xvar]], type="p",
       xlab=names(mydata1)[ind.xvar], ylab=names(mydata1)[k])
  lapply(seq_along(mydata_list), function(j){
    foo_reg(dat=mydata_list[[j]], xvar=ind.xvar, yvar=k, mycol=j,
            pos=mypos[j], name.dat=names(mydata_list)[j])
    return(NULL)
  })
  invisible(NULL)
})

HTH,

Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove




-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Ivan Calandra
Sent: Friday 25 February 2011 11:20
To: r-help
Subject: [R] speed up process

Dear users,

I have a double for loop that does exactly what I want, but is quite
slow. It is not so much with this simplified example, but IRL it is slow.
Can anyone help me improve it?

The data and code for foo_reg() are available at the end of the email; I
preferred going directly into the problematic part.
Here is the code (I tried to simplify it but I cannot do it too much or
else it wouldn't represent my problem). It might also look too complex
for what it is intended to do, but my colleagues who are also supposed
to use it don't know much about R. So I wrote it so that they don't have
to modify the critical parts to run the script for their needs.

#column indexes for function
ind.xvar <- 2
seq.yvar <- 3:4
#position vector for legend(), stupid positioning but it doesn't matter here
mypos <- c("topleft", "topright", "bottomleft")

#run the function for columns 3&4 as y (seq.yvar) with column 2 as x
#(ind.xvar) for all 3 datasets (mydata_list)
par(mfrow=c(2,1))
for (i in seq_along(seq.yvar)){
  k <- seq.yvar[i]
  plot(mydata1[[k]]~mydata1[[ind.xvar]], type="p",
       xlab=names(mydata1)[ind.xvar], ylab=names(mydata1)[k])
  for (j in seq_along(mydata_list)){
    foo_reg(dat=mydata_list[[j]], xvar=ind.xvar, yvar=k, mycol=j,
            pos=mypos[j], name.dat=names(mydata_list)[j])
  }
}

I tried with lapply() or mapply() but couldn't manage to pass the
arguments for names() and col= correctly, e.g. for the 2nd loop:
lapply(mydata_list, FUN=function(x){foo_reg(dat=x, xvar=ind.xvar,
yvar=k, col1=1:3, pos=mypos[1:3], name.dat=names(x)[1:3])})
mapply(FUN=function(x) {foo_reg(dat=x, name.dat=names(x)[1:3])},
mydata_list, col1=1:3, pos=mypos, MoreArgs=list(xvar=ind.xvar, yvar=k))
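
For reference, a minimal sketch of how mapply() can walk over the datasets, their names, the colors and the legend positions in parallel. show_args() is a hypothetical stand-in for foo_reg() that only reports what it received, and the two small data frames are made up:

```r
# Hypothetical stand-in for foo_reg(): reports the arguments of each call
show_args <- function(dat, xvar, yvar, mycol, pos, name.dat) {
  paste0(name.dat, ": n=", nrow(dat), ", col=", mycol, ", pos=", pos)
}

# Made-up datasets and legend positions
mydata_list <- list(d1 = data.frame(a = 1:3, b = 4:6),
                    d2 = data.frame(a = 1:2, b = 3:4))
mypos <- c("topleft", "topright")

# Arguments that vary per dataset are passed as parallel vectors/lists;
# arguments that stay constant go into MoreArgs
res <- mapply(show_args,
              dat = mydata_list,
              mycol = seq_along(mydata_list),
              pos = mypos,
              name.dat = names(mydata_list),
              MoreArgs = list(xvar = 1, yvar = 2))
res
```

The names are taken from the list itself (names(mydata_list)), not from each data frame, which is what names(x) inside the anonymous function would return.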

Thanks in advance for any hints.
Ivan




#create data (it looks horrible with these datasets but it doesn't
#matter here)
mydata1 <- structure(list(species = structure(1:8, .Label = c("alsen",
"gogor", "loalb", "mafas", "pacyn", "patro", "poabe", "thgel"), class =
"factor"), fruit = c(0.52, 0.45, 0.43, 0.82, 0.35, 0.9, 0.68, 0), Asfc =
c(207.463765, 138.5533755, 70.4391735, 160.9742745, 41.455809,
119.155109, 26.241441, 148.337377), Tfv = c(47068.1437773483,
43743.8087431582, 40323.5209129239, 23420.9455581495, 29382.6947428651,
50460.2202192311, 21810.1456510625, 41747.6053810881)), .Names =
c("species", "fruit", "Asfc", "Tfv"), row.names = c(NA, 8L), class =
"data.frame")

mydata2 <- mydata1[!(mydata1$species %in% c("thgel","alsen")),]
mydata3 <- mydata1[!(mydata1$species %in% c("thgel","alsen","poabe")),]
mydata_list <- list(mydata1=mydata1, mydata2=mydata2, mydata3=mydata3)

#function for regression
library(WRS)
foo_reg <- function(dat, xvar, yvar, mycol, pos, name.dat){
   tsts <- tstsreg(dat[[xvar]], dat[[yvar]])
   tsts_inter <- signif(tsts$coef[1], digits=3)
   tsts_slope <- signif(tsts$coef[2], digits=3)
   abline(tsts$coef, lty=1, col=mycol)
   legend(x=pos, legend=c(paste("TSTS ",name.dat,": Y=",tsts_inter,"+",
          tsts_slope,"X",sep="")), lty=1, col=mycol)
}





Re: [R] speed up process

2011-02-25 Thread Jim Holtman
Use Rprof to find where the time is being spent. It is probably in 'plot', which
would imply that the 'for' loop is not the problem and the cost is beyond your control.


Re: [R] color code in loop for piecharts plotting

2011-02-25 Thread Jim Lemon

On 02/25/2011 09:33 PM, Lucia Rueda wrote:


Hi,
I am using this loop

par(mfrow=c(3,3))
annos <- c(2001:2007,2009)
for (i in annos) {
  t <- subset(masia,YEAR==i)
  t$FAMILIA <- drop.levels(t$FAMILIA)
  pie(table(t$FAMILIA),main=i)
}

To make piecharts of species composition among years (my data frame is
called masia). So I get 1 piechart of the families that we have found in
our survey each year. We don't always have the same families every year, so I
added t$FAMILIA <- drop.levels(t$FAMILIA)
to the loop to avoid showing in the pie those family levels that are absent
in some specific years.

The problem is that the color code changes, and I get, for example, different
colors for the same families in different years.

If I group the families with few individuals together in a
category called "others" and make a new column called "familia2" with
fewer levels, so that every year I have all levels of familia2 in my species
composition, I don't get the problem and all families have the same color
among years.

Does anybody know how to avoid the color code change for the families in the
loop? I know I can do it manually and give each family a color, but I have
quite a lot of families, so I'm wondering if there's some other way to fix
that.


Hi Lucia,
FAMILIA is probably a factor, therefore can be used as an index with 
as.numeric(). So if you have a vector of colors for all the families in 
your dataset, you could specify the color for each sector of the pie with:


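# this gives you different colors for each family in the whole dataset
family_colors <- 1:length(levels(masia$FAMILIA))
for(i in annos) {
 t <- subset(masia,YEAR==i)
 # levels are not dropped, so the factor codes index both the table
 # and the color vector; sort() keeps sectors and colors in the same order
 sector_index <- sort(unique(as.numeric(t$FAMILIA)))
 pie(table(t$FAMILIA)[sector_index],main=i,col=family_colors[sector_index])
}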

Can't try it at the moment, but it should be close.
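
A self-contained sketch of the same idea, assuming a made-up masia data frame (the family names and years below are invented for illustration). Colors are attached to family names once, so each family keeps its color whichever families are present in a given year:

```r
# Made-up data standing in for the real masia data frame
masia <- data.frame(
  YEAR    = c(2001, 2001, 2001, 2002, 2002),
  FAMILIA = factor(c("Labridae", "Sparidae", "Labridae", "Sparidae", "Gobiidae"))
)

# One color per family, named by family, defined once for all years
family_colors <- setNames(rainbow(nlevels(masia$FAMILIA)),
                          levels(masia$FAMILIA))

par(mfrow = c(1, 2))
for (yr in sort(unique(masia$YEAR))) {
  counts <- table(masia$FAMILIA[masia$YEAR == yr])
  counts <- counts[counts > 0]          # drop families absent this year
  # look colors up by family name, not by position
  pie(counts, main = yr, col = family_colors[names(counts)])
}
```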

Jim

[as.numeric(unique(t$FAMILIA[i]))] without dropping the levels (I think).



[R] lm - log(variable) - skip log(0)

2011-02-25 Thread agent dunham


I want to do an lm regression in which some of the variables are
log-transformed. I would like not to take into account the values that imply
computing log(0).

For just one variable I have done the following, but it doesn't work:

lmod1.lm <- lm(log(dat$inaltu)~log(dat$indiam),subset=(!(dat$indiam %in%
c(0,1))))

and obtain:

Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
  0 (non-NA) cases

lmod1.lm <- lm(log(dat$inaltu)~log(dat$indiam),subset=(!(dat$indiam = 0)),
na.action=na.exclude)

and obtain

Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
  NA/NaN/Inf in foreign function call (arg 1)

Thanks, u...@host.com


[R] group by in data.frame

2011-02-25 Thread zem

Hi all,

I have a little problem, and I think it is really simple to solve, but I
don't know exactly how.
Here is the challenge:
I have a data.frame with n columns. I have to group by 2 of them and calculate
the mean value of a third one. So far so good, that was easy; I used the
aggregate function to do this:
group <- aggregate(x[,1],list(x[,2],x[,3]),mean)
Now I have to copy the calculated mean value to every row of the
data.frame (in a new column), so that each row gets the value
of the group it belongs to.

It would be great if someone could help me.
Thanks in advance!


Re: [R] select element from vector

2011-02-25 Thread zem

Hi Jessica,

try this: Q[k:c(k+3)]
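
A small illustration with a made-up vector Q and index k. The parentheses around k+3 matter, because the colon operator binds more tightly than +:

```r
Q <- c(10, 20, 30, 40, 50, 60)   # made-up vector
k <- 2

Q[k:(k + 3)]   # elements k to k+3: 20 30 40 50
k:k + 3        # without parentheses this is (k:k) + 3, i.e. 5
```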


Re: [R] speed up process

2011-02-25 Thread Ivan Calandra

Dear Jim,

I've tried to use Rprof() as you advised me, but I don't understand how 
it works.

I've done this:
Rprof(for (i in seq_along(seq.yvar)){
  all_my_commands
})
summaryRprof()

But I got this error:
Error in summaryRprof() : no lines found in ‘Rprof.out’

I couldn't really understand from the help page what I should do.
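
For reference, a minimal sketch of the usual Rprof() pattern: its argument is the name of an output file rather than the code to profile, and profiling is switched on and off around the code. slow_step() is a hypothetical stand-in for the real work; the run has to last long enough for samples to be collected.

```r
# Hypothetical stand-in for the slow part of the script
slow_step <- function() sort(runif(1e5))

prof_file <- tempfile(fileext = ".out")
Rprof(prof_file)                 # start profiling; the argument is a file name
for (i in 1:300) slow_step()     # run the code to be profiled as usual
Rprof(NULL)                      # stop profiling
prof <- summaryRprof(prof_file)  # read the samples back
head(prof$by.self)               # functions ranked by their own time
```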

In any case, it's sure that the function tstsreg(), is what takes the 
most computing time. But I wanted to optimize the rest of the code to 
gain as much speed as possible.


Ivan


Re: [R] lm - log(variable) - skip log(0)

2011-02-25 Thread Liaw, Andy
You need to use == instead of = for testing equality.  While you're at it,
you should check for positive values, not just screen out 0s.  This works
for me:

R> mydata = data.frame(x=0:10, y=runif(11))
R> fm = lm(y ~ log(x), mydata, subset=x>0)
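
The same idea applied to the variable names from the question, as a sketch with made-up values (the numbers in dat are assumptions). Using data= together with bare variable names lets subset= be written in terms of the columns, and both logged variables are restricted to positive values:

```r
# Made-up data using the column names from the question
dat <- data.frame(indiam = c(0, 1, 2, 4, 8, 16),
                  inaltu = c(3, 1, 2.1, 3.9, 8.2, 15.7))

# Keep only rows where both logged variables are strictly positive
fit <- lm(log(inaltu) ~ log(indiam), data = dat,
          subset = indiam > 0 & inaltu > 0)
nobs(fit)   # number of rows actually used in the fit
```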
 
Andy




Re: [R] group by in data.frame

2011-02-25 Thread Ivan Calandra

Hi,

I think ave() might do what you want:
df <- data.frame(a=rep(c("this","that"),5), b1=rnorm(10), b2=rnorm(10))
ave(df[,2], df[,1], FUN=mean)

For all columns, you could do that:
d <- lapply(df[,2:3], FUN=function(x)ave(x,df[,1],FUN=mean))
df2 <- cbind(df, d)
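
With two grouping columns, as in the original question, ave() accepts several grouping vectors. A minimal sketch with made-up data (the column names value, g1 and g2 are assumptions):

```r
# Made-up data: one value column and two grouping columns
x <- data.frame(value = c(1, 3, 5, 7, 10, 20),
                g1 = c("a", "a", "a", "a", "b", "b"),
                g2 = c("u", "u", "v", "v", "u", "u"))

# Group mean repeated on every row of its (g1, g2) group
x$group_mean <- ave(x$value, x$g1, x$g2, FUN = mean)
x
```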

HTH,
Ivan



[R] count data

2011-02-25 Thread Sacha Viquerat

hello dear list! I wonder about the layout of my csv for my study design:

i have 11 different sites.

each site had been visited 9 times.

on each visit, 6 distinctive water parameters had been taken ONCE on 
each visit (as continuous variables).


on each visit, the fish abundance was counted using a net at 3 different 
locations within the site (count data).


I know i will have to do an lmer using the nested locations as error 
term. Question is: how to organize my data, since i have abundances from 
the same 3 locations per site replicate but only one water parameter 
measurement per site replicate. to give you an idea, heres the basic 
look so far of my csv:



site  location  abundance  pH   no3    and so on...
A     1         12         7.1  0.003  ...
A     2         15         7.1  0.003  ...
A     3         18         7.1  0.003  ...
B     1         11         7.4  0.004  ...
B     2         8          7.4  0.004  ...
B     3         17         7.4  0.004  ...
A     1         13         7.2  0.001  ...
A     2         19         7.2  0.001  ...
A     3         21         7.2  0.001  ...
B     1         9          6.9  0.002  ...
B     2         5          6.9  0.002  ...
B     3         2          6.9  0.002  ...

i just made up the table to give an idea how the data looks like. the 
goal would be to analyze fish abundance ~ water parameters, does anyone 
have a suggestion?


thanks in advance!



[R] limma function problem

2011-02-25 Thread Sukhbir Rattan
Hi,

I have two data sets of normalized Affymetrix CEL files, wild type vs control
type (each set has three replicates).


> wild.fish
AffyBatch object
size of arrays=712x712 features (10 kb)
cdf=Zebrafish (15617 affyids)
number of samples=3
number of genes=15617
annotation=zebrafish
notes=
> Dicer.fish
AffyBatch object
size of arrays=712x712 features (10 kb)
cdf=Zebrafish (15617 affyids)
number of samples=3
number of genes=15617
annotation=zebrafish
notes=

Now I have to combine these two S4 objects and use the lmFit function of the
limma package. I am able to combine the two S4 objects using the merge function.


> merge.fish <- merge(wild.fish,Dicer.fish)
> merge.fish
AffyBatch object
size of arrays=712x712 features (17833 kb)
cdf=Zebrafish (15617 affyids)
number of samples=6
number of genes=15617
annotation=zebrafish
notes=Merge from two AffyBatches with notes: 1)  , and 2)

> design
             Wild Mz_Dicer
GSM95623.CEL    1        0
GSM95624.CEL    1        0
GSM95625.CEL    1        0
GSM95617.CEL    0        1
GSM95618.CEL    0        1
GSM95619.CEL    0        1


> fit <- lmFit(merge.fish, design)
Error in as.vector(data) :
  no method for coercing this S4 class to a vector

> mode(merge.fish)
[1] "S4"


So, how to troubleshoot this problem?


Regards,
Sukhbir Singh Rattan.



Re: [R] count data

2011-02-25 Thread ONKELINX, Thierry
Dear Sacha,

Do you revisit the same locations per site? If so, use (1|site/location) as 
random effect. Otherwise use just (1|site). You might want to add a crossed 
random effect (1|date) if you can expect an effect of phenology.
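
On the data layout itself, one possible arrangement is a long table with one row per net count, with the visit-level water parameters repeated on each row. A minimal sketch with made-up values; the visit column and the two separate input tables are assumptions, and the model call is shown only as a comment because it needs the lme4 package:

```r
# One row per net count (made-up values)
counts <- data.frame(site = rep(c("A", "B"), each = 3),
                     visit = 1,
                     location = rep(1:3, times = 2),
                     abundance = c(12, 15, 18, 11, 8, 17))
# One row per site visit for the water parameters
water <- data.frame(site = c("A", "B"), visit = 1,
                    pH = c(7.1, 7.4), no3 = c(0.003, 0.004))

# Repeat the water parameters on every count row of the same site visit
long <- merge(counts, water, by = c("site", "visit"))
long

# With lme4 installed, a Poisson mixed model along the lines above might be:
# lme4::glmer(abundance ~ pH + no3 + (1 | site/location),
#             data = long, family = poisson)
```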

Best regards,

Thierry

PS R-sig-mixed-models is a better list for this kind of questions.


ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek
team Biometrie & Kwaliteitszorg
Gaverstraat 4
9500 Geraardsbergen
Belgium

Research Institute for Nature and Forest
team Biometrics & Quality Assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium

tel. + 32 54/436 185
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than 
asking him to perform a post-mortem examination: he may be able to say what the 
experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not ensure 
that a reasonable answer can be extracted from a given body of data.
~ John Tukey
  



[R] R 2.12.2 is released

2011-02-25 Thread Peter Dalgaard
I've rolled up R-2.12.2.tar.gz a short while ago. This is an update release, 
which fixes a number of mostly minor issues, and one major issue in which 
complex arithmetic was being messed up on some compiler platforms.

You can get it from

http://cran.r-project.org/src/base/R-2/R-2.12.2.tar.gz

or wait for it to be mirrored at a CRAN site nearer to you.

Binaries for various platforms will appear in due course.

For the R Core Team

Peter Dalgaard

These are the md5sums for the freshly created files, in case you wish
to check that they are uncorrupted:

MD5 (AUTHORS) = ac9746b4845ae81f51cfc99262f5
MD5 (COPYING) = eb723b61539feef013de476e68b5c50a
MD5 (COPYING.LIB) = a6f89e2100d9b6cdffcea4f398e37343
MD5 (FAQ) = 72deeabefdf6fd14e83bf5703dce9176
MD5 (INSTALL) = 70447ae7f2c35233d3065b004aa4f331
MD5 (NEWS) = 30b55e4f34c155fcb2fafa7ebb55528e
MD5 (ONEWS) = 0c3e10eef74439786e5fceddd06dac71
MD5 (OONEWS) = b0d650eba25fc5664980528c147a20db
MD5 (R-latest.tar.gz) = bc70b51dddab8aa39066710624e55d5e
MD5 (README) = 296871fcf14f49787910c57b92655c76
MD5 (RESOURCES) = 020479f381d5f9038dcb18708997f5da
MD5 (THANKS) = f2ccf22f3e20ebaa86f8ee5cc6b0f655
MD5 (R-2/R-2.12.2.tar.gz) = bc70b51dddab8aa39066710624e55d5e
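
A downloaded file can be checked against these sums from within R using tools::md5sum(). The sketch below is only an illustration: verify_md5() is a hypothetical helper, and it is run on a small temporary file rather than on the real tarball.

```r
# Hypothetical helper: TRUE when the file's MD5 sum equals the published one
verify_md5 <- function(path, expected) {
  identical(unname(tools::md5sum(path)), tolower(expected))
}

# Illustration on a small temporary file instead of the real download
f <- tempfile()
writeBin(charToRaw("hello\n"), f)
verify_md5(f, unname(tools::md5sum(f)))            # a file matches its own sum
verify_md5(f, "bc70b51dddab8aa39066710624e55d5e")  # sum of a different file
```

For the real release, path would be the downloaded R-2.12.2.tar.gz and expected the sum listed above for it.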

This is the relevant part of the NEWS file:

R News

CHANGES IN R VERSION 2.12.2:

  SIGNIFICANT USER-VISIBLE CHANGES:

• Complex arithmetic (notably z^n for complex z and integer n) gave
  incorrect results since R 2.10.0 on platforms without C99 complex
  support.  This and some lesser issues in trigonometric functions
  have been corrected.

  Such platforms were rare (we know of Cygwin and FreeBSD).
  However, because of new compiler optimizations in the way complex
  arguments are handled, the same code was selected on x86_64 Linux
  with gcc 4.5.x at the default -O2 optimization (but not at -O).

• There is a workaround for crashes seen with several packages on
  systems using zlib 1.2.5: see the INSTALLATION section.
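
A quick sanity check of the complex-power item above can be sketched as follows; the test values are arbitrary, and repeated multiplication serves as an independent reference for the power operator:

```r
z <- complex(real = 1, imaginary = 1)

z^2                        # mathematically (1+i)^2 = 2i
z3 <- z * z * z            # repeated multiplication as a reference
all.equal(z^3, z3)         # the power operator should agree with it
```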

  NEW FEATURES:

• PCRE has been updated to 8.12 (two bug-fix releases since 8.10).

• rep(), seq(), seq.int() and seq_len() report more often when the
  first element is taken of an argument of incorrect length.

• The Cocoa back-end for the quartz() graphics device on Mac OS X
  provides a way to disable event loop processing temporarily
  (useful, e.g., for forked instances of R).

• kernel()'s default for m was not appropriate if coef was a set of
  coefficients.  (Reported by Pierre Chausse.)

• bug.report() has been updated for the current R bug tracker,
  which does not accept emailed submissions.

• R CMD check now checks for the correct use of $(LAPACK_LIBS) (as
  well as $(BLAS_LIBS)), since several recent CRAN submissions have
  ignored ‘Writing R Extensions’.

  INSTALLATION:

• The zlib sources in the distribution are now built with all
  symbols remapped: this is intended to avoid problems seen with
  packages such as XML and rggobi which link to zlib.so.1 on
  systems using zlib 1.2.5.

• The default for FFLAGS and FCFLAGS with gfortran on x86_64 Linux
  has been changed back to -g -O2: however, setting -g -O may still
  be needed for gfortran 4.3.x.

  PACKAGE INSTALLATION:

• A LazyDataCompression field in the DESCRIPTION file will be used
  to set the value for the --data-compress option of R CMD INSTALL.

• Files R/sysdata.rda of more than 1Mb are now stored in the
  lazyload database using xz compression: this for example halves
  the installed size of package Imap.

• R CMD INSTALL now ensures that directories installed from inst
  have search permission for everyone.

  It no longer installs files inst/doc/Rplots.ps and
  inst/doc/Rplots.pdf.  These are almost certainly left-overs from
  Sweave runs, and are often large.

  DEPRECATED & DEFUNCT:

• The ‘experimental’ alternative specification of a name space via
  .Export() etc is now deprecated.

• zip.file.extract() is now deprecated.

• Zip-ing data sets in packages (and hence R CMD INSTALL
  --use-zip-data and the ZipData: yes field in a DESCRIPTION file)
  is deprecated: using efficiently compressed .rda images and
  lazy-loading of data has superseded it.

  BUG FIXES:

• identical() could in rare cases generate a warning about
  non-pairlist attributes on CHARSXPs.  As these are used for
  internal purposes, the attribute check should be skipped.
  (Reported by Niels Richard Hansen).

• If the filename extension (usually .Rnw) was not included in a
  call to Sweave(), source references would not work properly and
  the keep.source option failed.  (PR#14459)

• format.data.frame() now keeps zero character column names.

• pretty(x) no longer raises an error when x contains solely
  non-finite values. (PR#14468)

• The plot.TukeyHSD() function now uses a line width of 0.5 for its
  

[R] Visualizing Points on a Sphere

2011-02-25 Thread Lorenzo Isella

Dear All,
I need to plot some points on the surface of a sphere, but I am not sure 
about how to proceed to achieve this in R (or if it is suitable for this 
at all).
In any case, I am not looking for really fancy visualizations; for 
instance you can consider the images between formulae 5 and 6 at


http://bit.ly/hOgK9h

Any suggestion is appreciated.
Cheers

Lorenzo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] speed up process

2011-02-25 Thread jim holtman
You invoke Rprof, run your code and then terminate it:


Rprof()
# ... code you want to profile ...
Rprof(NULL)  # stop profiling; the samples are in the file Rprof.out
summaryRprof()

example:


 Rprof()
 for (i in 1:1e6) sin(i) + cos(i) + sqrt(i)
 Rprof(NULL)
 summaryRprof()
$by.self
     self.time self.pct total.time total.pct
sin       0.24    30.77       0.24     30.77
sqrt      0.22    28.21       0.22     28.21
cos       0.16    20.51       0.16     20.51
+         0.14    17.95       0.14     17.95
:         0.02     2.56       0.02      2.56

$by.total
     total.time total.pct self.time self.pct
sin        0.24     30.77      0.24    30.77
sqrt       0.22     28.21      0.22    28.21
cos        0.16     20.51      0.16    20.51
+          0.14     17.95      0.14    17.95
:          0.02      2.56      0.02     2.56

$sample.interval
[1] 0.02

$sampling.time
[1] 0.78
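
The error in the quoted message below comes from passing the code to Rprof() as if it were system.time(). Here is a minimal sketch of the difference, assuming a made-up function slow() and a temporary file name (neither is from this thread): system.time() takes the expression itself, whereas Rprof() takes a file name and is switched on and off around the code.

```r
# Hypothetical workload, only here to have something to time and profile
slow <- function(n) {
  s <- 0
  for (i in seq_len(n)) s <- s + sqrt(i) + sin(i)
  s
}

# system.time(): the expression is the argument
elapsed <- system.time(res1 <- slow(5e6))

# Rprof(): the argument is the output file; the code to profile runs
# between the call that starts profiling and the call that stops it
prof_file <- tempfile(fileext = ".out")   # assumed file name
Rprof(prof_file)
res2 <- slow(5e6)
Rprof(NULL)                               # stop profiling
prof <- summaryRprof(prof_file)           # summarise the collected samples

print(elapsed)
print(prof$by.self)
```

If the profiled code finishes faster than the sampling interval, no samples are written and summaryRprof() stops with the same "no lines found" error.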


On Fri, Feb 25, 2011 at 6:57 AM, Ivan Calandra
ivan.calan...@uni-hamburg.de wrote:
 Dear Jim,

 I've tried to use Rprof() as you advised me, but I don't understand how it
 works.
 I've done this:
 Rprof(for (i in seq_along(seq.yvar)){
  all_my_commands
 })
 summaryRprof()

 But I got this error:
 Error in summaryRprof() : no lines found in ‘Rprof.out’

 I couldn't really understand from the help page what I should do.

 In any case, it's sure that the function tstsreg(), is what takes the most
 computing time. But I wanted to optimize the rest of the code to gain as
 much speed as possible.

 Ivan

 Le 2/25/2011 12:30, Jim Holtman a écrit :

 use Rprof to find where time is being spent.  probably in 'plot' which
 might imply it is not the 'for' loop and therefore beyond your control.

 Sent from my iPad

 On Feb 25, 2011, at 6:19, Ivan Calandraivan.calan...@uni-hamburg.de
  wrote:

 Thanks Nick for your quick answer.
 It does work (no missed bracket!) but unfortunately doesn't really speed
 up anything: with my real data, it takes 82.78 seconds with the double
 lapply() instead of 83.59s with the double loop (about 0.8 s).

 It looks like my double loop was not that bad. Does anyone know another
 faster way to do this?

 Thanks again in advance,
 Ivan

 Le 2/25/2011 11:41, Nick Sabbe a écrit :

 Simply avoiding the for loops by using lapply (I may have missed a
 bracket
 here or there cause I did this without opening R)...
 Haven't checked the speed up, though.

 lapply(seq.yvar, function(k){
    plot(mydata1[[k]]~mydata1[[ind.xvar]], type="p",
         xlab=names(mydata1)[ind.xvar], ylab=names(mydata1)[k])
    lapply(seq_along(mydata_list), function(j){
      foo_reg(dat=mydata_list[[j]], xvar=ind.xvar, yvar=k, mycol=j,
              pos=mypos[j], name.dat=names(mydata_list)[j])
      return(NULL)
    })
    invisible(NULL)
 })

 HTH,

 Nick Sabbe
 --
 ping: nick.sa...@ugent.be
 link: http://biomath.ugent.be
 wink: A1.056, Coupure Links 653, 9000 Gent
 ring: 09/264.59.36

 -- Do Not Disapprove




 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On
 Behalf Of Ivan Calandra
 Sent: vrijdag 25 februari 2011 11:20
 To: r-help
 Subject: [R] speed up process

 Dear users,

 I have a double for loop that does exactly what I want, but is quite
 slow. It is not so much with this simplified example, but IRL it is
 slow.
 Can anyone help me improve it?

 The data and code for foo_reg() are available at the end of the email; I
 preferred going directly into the problematic part.
 Here is the code (I tried to simplify it but I cannot do it too much or
 else it wouldn't represent my problem). It might also look too complex
 for what it is intended to do, but my colleagues who are also supposed
 to use it don't know much about R. So I wrote it so that they don't have
 to modify the critical parts to run the script for their needs.

 #column indexes for function
 ind.xvar <- 2
 seq.yvar <- 3:4
 #position vector for legend(), stupid positioning but it doesn't matter here
 mypos <- c("topleft", "topright", "bottomleft")

 #run the function for columns 3 & 4 as y (seq.yvar) with column 2 as x
 #(ind.xvar) for all 3 datasets (mydata_list)
 par(mfrow=c(2,1))
 for (i in seq_along(seq.yvar)){
    k <- seq.yvar[i]
    plot(mydata1[[k]]~mydata1[[ind.xvar]], type="p",
         xlab=names(mydata1)[ind.xvar], ylab=names(mydata1)[k])
    for (j in seq_along(mydata_list)){
      foo_reg(dat=mydata_list[[j]], xvar=ind.xvar, yvar=k, mycol=j,
              pos=mypos[j], name.dat=names(mydata_list)[j])
    }
 }

 I tried with lapply() or mapply() but couldn't manage to pass the
 arguments for names() and col= correctly, e.g. for the 2nd loop:
 lapply(mydata_list, FUN=function(x){foo_reg(dat=x, xvar=ind.xvar,
 yvar=k, col1=1:3, pos=mypos[1:3], name.dat=names(x)[1:3])})
 mapply(FUN=function(x) {foo_reg(dat=x, name.dat=names(x)[1:3])},
 mydata_list, col1=1:3, pos=mypos, MoreArgs=list(xvar=ind.xvar, yvar=k))

 Thanks in advance for any hints.
 Ivan




 #create data (it looks horrible with these datasets but it doesn't
 matter here)
 mydata1- structure(list(species = structure(1:8, .Label = c(alsen,
 gogor, loalb, 

Re: [R] speed up process

2011-02-25 Thread Ivan Calandra

Ha... it was way too simple!
I thought it would be like system.time()... my bad. Thanks for the tip!

As we thought, foo_reg() takes most of the computing time, and I cannot 
improve that.

Any ideas of how to improve the rest?

Thanks again for your help
Ivan


Le 2/25/2011 14:29, jim holtman a écrit :

You invoke Rprof, run your code and then terminate it:


Rprof()
# ... code you want to profile ...
Rprof(NULL)  # stop profiling; the samples are in the file Rprof.out
summaryRprof()

example:



Rprof()
for (i in 1:1e6) sin(i) + cos(i) + sqrt(i)
Rprof(NULL)
summaryRprof()

$by.self
     self.time self.pct total.time total.pct
sin       0.24    30.77       0.24     30.77
sqrt      0.22    28.21       0.22     28.21
cos       0.16    20.51       0.16     20.51
+         0.14    17.95       0.14     17.95
:         0.02     2.56       0.02      2.56

$by.total
     total.time total.pct self.time self.pct
sin        0.24     30.77      0.24    30.77
sqrt       0.22     28.21      0.22    28.21
cos        0.16     20.51      0.16    20.51
+          0.14     17.95      0.14    17.95
:          0.02      2.56      0.02     2.56

$sample.interval
[1] 0.02

$sampling.time
[1] 0.78


On Fri, Feb 25, 2011 at 6:57 AM, Ivan Calandra
ivan.calan...@uni-hamburg.de  wrote:

Dear Jim,

I've tried to use Rprof() as you advised me, but I don't understand how it
works.
I've done this:
Rprof(for (i in seq_along(seq.yvar)){
  all_my_commands
})
summaryRprof()

But I got this error:
Error in summaryRprof() : no lines found in ‘Rprof.out’

I couldn't really understand from the help page what I should do.

In any case, it's sure that the function tstsreg(), is what takes the most
computing time. But I wanted to optimize the rest of the code to gain as
much speed as possible.

Ivan

Le 2/25/2011 12:30, Jim Holtman a écrit :

use Rprof to find where time is being spent.  probably in 'plot' which
might imply it is not the 'for' loop and therefore beyond your control.

Sent from my iPad

On Feb 25, 2011, at 6:19, Ivan Calandraivan.calan...@uni-hamburg.de
  wrote:


Thanks Nick for your quick answer.
It does work (no missed bracket!) but unfortunately doesn't really speed
up anything: with my real data, it takes 82.78 seconds with the double
lapply() instead of 83.59s with the double loop (about 0.8 s).

It looks like my double loop was not that bad. Does anyone know another
faster way to do this?

Thanks again in advance,
Ivan

Le 2/25/2011 11:41, Nick Sabbe a écrit :

Simply avoiding the for loops by using lapply (I may have missed a
bracket
here or there cause I did this without opening R)...
Haven't checked the speed up, though.

lapply(seq.yvar, function(k){
   plot(mydata1[[k]]~mydata1[[ind.xvar]], type="p",
        xlab=names(mydata1)[ind.xvar], ylab=names(mydata1)[k])
   lapply(seq_along(mydata_list), function(j){
     foo_reg(dat=mydata_list[[j]], xvar=ind.xvar, yvar=k, mycol=j,
             pos=mypos[j], name.dat=names(mydata_list)[j])
     return(NULL)
   })
   invisible(NULL)
})

HTH,

Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove




-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On
Behalf Of Ivan Calandra
Sent: vrijdag 25 februari 2011 11:20
To: r-help
Subject: [R] speed up process

Dear users,

I have a double for loop that does exactly what I want, but is quite
slow. It is not so much with this simplified example, but IRL it is
slow.
Can anyone help me improve it?

The data and code for foo_reg() are available at the end of the email; I
preferred going directly into the problematic part.
Here is the code (I tried to simplify it but I cannot do it too much or
else it wouldn't represent my problem). It might also look too complex
for what it is intended to do, but my colleagues who are also supposed
to use it don't know much about R. So I wrote it so that they don't have
to modify the critical parts to run the script for their needs.

#column indexes for function
ind.xvar <- 2
seq.yvar <- 3:4
#position vector for legend(), stupid positioning but it doesn't matter here
mypos <- c("topleft", "topright", "bottomleft")

#run the function for columns 3 & 4 as y (seq.yvar) with column 2 as x
#(ind.xvar) for all 3 datasets (mydata_list)
par(mfrow=c(2,1))
for (i in seq_along(seq.yvar)){
   k <- seq.yvar[i]
   plot(mydata1[[k]]~mydata1[[ind.xvar]], type="p",
        xlab=names(mydata1)[ind.xvar], ylab=names(mydata1)[k])
   for (j in seq_along(mydata_list)){
     foo_reg(dat=mydata_list[[j]], xvar=ind.xvar, yvar=k, mycol=j,
             pos=mypos[j], name.dat=names(mydata_list)[j])
   }
}

I tried with lapply() or mapply() but couldn't manage to pass the
arguments for names() and col= correctly, e.g. for the 2nd loop:
lapply(mydata_list, FUN=function(x){foo_reg(dat=x, xvar=ind.xvar,
yvar=k, col1=1:3, pos=mypos[1:3], name.dat=names(x)[1:3])})
mapply(FUN=function(x) {foo_reg(dat=x, name.dat=names(x)[1:3])},
mydata_list, col1=1:3, pos=mypos, 

Re: [R] accuracy of measurements

2011-02-25 Thread Marc Schwartz
On Feb 24, 2011, at 4:50 PM, Denis Kazakiewicz wrote:

 Dear R people
 Could you please help with following
 
 Trying to compare accuracy of tumor size evaluation by different
 methods. So data looks like
 
 id  true  method1  method2 ...
 1   2     2        2.5
 2   1.5   2        2
 3   2     2        2
 
 etc.
 
 Could you please give a hint how to deal with that.
 Seems like {merror} does not suit me because I am trying to compare
 accuracy of measurements with their true known values not just overall
 agreement of methods.
 Moreover sample size is ridiculously small (33 patients) so ANOVA is not
 much of help (or is it?)
 Any suggestions, hints and even guesses are highly appreciated. I am
 stuck a bit.


Denis,

I would suggest that you start here:

  http://www-users.york.ac.uk/~mb55/meas/meas.htm

This covers various resources pertaining to the design and analysis of 
measurement studies, primarily based upon methods by Bland and Altman.

HTH,

Marc Schwartz



Re: [R] accuracy of measurements

2011-02-25 Thread Dennis Murphy
And in that vein, the recently released MethComp package by Bendix
Carstensen may be of service.

HTH,
Dennis

On Fri, Feb 25, 2011 at 5:39 AM, Marc Schwartz marc_schwa...@me.com wrote:

 On Feb 24, 2011, at 4:50 PM, Denis Kazakiewicz wrote:

  Dear R people
  Could you please help with following
 
  Trying to compare accuracy of tumor size evaluation by different
  methods. So data looks like
 
  id  true  method1  method2 ...
  1   2     2        2.5
  2   1.5   2        2
  3   2     2        2
 
  etc.
 
  Could you please give a hint how to deal with that.
  Seems like {merror} does not suit me because I am trying to compare
  accuracy of measurements with their true known values not just overall
  agreement of methods.
  Moreover sample size is ridiculously small (33 patients) so ANOVA is not
  much of help (or is it?)
  Any suggestions, hints and even guesses are highly appreciated. I am
  stuck a bit.


 Denis,

 I would suggest that you start here:

  http://www-users.york.ac.uk/~mb55/meas/meas.htm

 This covers various resources pertaining to the design and analysis of
 measurement studies, primarily based upon methods by Bland and Altman.

 HTH,

 Marc Schwartz





Re: [R] BFGS versus L-BFGS-B

2011-02-25 Thread Ben Bolker
Brian Tsai btsai00 at gmail.com writes:

 
 Hi all,
 
 I'm trying to figure out what the effective differences between BFGS and
 L-BFGS-B are, besides the obvious ones: L-BFGS-B should use a lot less
 memory, and the user can provide box constraints.
 
 1) Why would you ever want to use BFGS, if L-BFGS-B does the same thing but
 uses less memory?

  L-BFGS-B is a bit more finicky: for example, it does not allow
non-finite (infinite or NA) return values from the objective function,
while BFGS does (although neither does during the initial function evaluation).
I don't know offhand of other differences, although speed may differ.

 2) If i'm optimizing with respect to a variable x that must be non-negative,
 a common approach is to do a change of variables x = exp(y), and optimize
 unconstrained with respect to y.  Is optimization using box constraints on
 x, likely to produce as good a result as unconstrained optimization on y?

  It depends.  If the optimal solution is on the boundary (i.e. x=0)
then optimization on the transformed variable y (with x = exp(y), so that
x=0 is only reached as y goes to -Inf) will work very badly. On the other hand, if the solution is in 
the interior then transforming sometimes works even better -- for example,
the goodness-of-fit surface may be closer to quadratic (which sometimes
has advantages in terms of inference) with the transformed than the
untransformed parameter.
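
A minimal sketch of the two approaches, assuming two made-up one-parameter objectives (obj_interior with its minimum at x = 2 and obj_boundary with its minimum at x = 0; neither is from the original question). It only illustrates the mechanics of a box constraint on x versus an unconstrained fit on y with x = exp(y), not a general comparison of the two methods.

```r
# Hypothetical objectives, for illustration only
obj_interior <- function(x) (x - 2)^2 + 1   # minimum inside x > 0
obj_boundary <- function(x) x^2             # minimum on the boundary x = 0

# (a) box constraint on x, handled by L-BFGS-B
box_int <- optim(par = 1, fn = obj_interior, method = "L-BFGS-B", lower = 0)
box_bnd <- optim(par = 1, fn = obj_boundary, method = "L-BFGS-B", lower = 0)

# (b) unconstrained BFGS on y, where x = exp(y) is positive for any y
log_int <- optim(par = 0, fn = function(y) obj_interior(exp(y)), method = "BFGS")
log_bnd <- optim(par = 0, fn = function(y) obj_boundary(exp(y)), method = "BFGS")

# Back-transform the estimates from (b) to the x scale
x_int <- exp(log_int$par)
x_bnd <- exp(log_bnd$par)

print(c(box = box_int$par, transformed = x_int))   # interior case
print(c(box = box_bnd$par, transformed = x_bnd))   # boundary case
print(log_bnd$par)                                 # y itself in the boundary case
```

In the boundary case the transformed fit cannot return x = 0; y just drifts towards large negative values until the convergence tolerance stops it.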

  Ben Bolker



Re: [R] Visualizing Points on a Sphere

2011-02-25 Thread Duncan Murdoch

On 25/02/2011 8:21 AM, Lorenzo Isella wrote:

Dear All,
I need to plot some points on the surface of a sphere, but I am not sure
about how to proceed to achieve this in R (or if it is suitable for this
at all).
In any case, I am not looking for really fancy visualizations; for
instance you can consider the images between formulae 5 and 6 at

http://bit.ly/hOgK9h

Any suggestion is appreciated.



Those plots show simple linear projections of the points, after culling 
those that are on the far side of the sphere.  That's very easy for the 
points, slightly more work for the grid.  I'm not aware of any package 
that implements all of it, but you could do it yourself fairly easily.
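
A rough base-graphics sketch of that idea, assuming simulated points and a viewer placed on the positive x axis (both are assumptions for illustration). It draws only the points and the outline of the sphere; the latitude/longitude grid is left out.

```r
set.seed(1)
n <- 500

# Simulated points on the unit sphere (assumed data)
theta <- runif(n, 0, 2 * pi)      # azimuthal angle
phi   <- acos(runif(n, -1, 1))    # polar angle; uniform in cos(phi)
x <- sin(phi) * cos(theta)
y <- sin(phi) * sin(theta)
z <- cos(phi)

# Viewer on the positive x axis: points with x < 0 are on the far side
front <- x >= 0

# Linear (orthographic) projection onto the y-z plane
plot(y[front], z[front], asp = 1, pch = 16, cex = 0.5,
     xlim = c(-1, 1), ylim = c(-1, 1), xlab = "y", ylab = "z")

# Outline of the sphere
a <- seq(0, 2 * pi, length.out = 200)
lines(cos(a), sin(a))
```

To look from another direction, rotate the coordinates first and then cull and project in the same way.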


If you want something more fancy you could use the rgl package for 3d 
plots that you can rotate.  You'll still have to draw the grid, and 
you'll probably find it a little painful to implement the hidden surface 
removal:  rgl uses depth checking to remove things, and because of 
rounding error it's not very good at drawing points and lines on 
surfaces.  (There are new options to control depth checking; see 
depth_mask and depth_test in ?material3d.  You can probably improve 
the default behaviour using those).


Duncan Murdoch



Re: [R] Visualizing Points on a Sphere

2011-02-25 Thread Matt Shotwell
That's interesting. You might also like:
http://en.wikipedia.org/wiki/Von_Mises%E2%80%93Fisher_distribution

I'm not sure how to plot the wireframe sphere, but you can visualize the
points by transforming to Cartesian coordinates like so:

u <- runif(1000,0,1)
v <- runif(1000,0,1)
theta <- 2 * pi * u        # azimuthal angle in [0, 2*pi)
phi   <- acos(2 * v - 1)   # polar angle in [0, pi]
x <- sin(phi) * cos(theta)
y <- sin(phi) * sin(theta)
z <- cos(phi)
library(lattice)
cloud(z ~ x + y)

-Matt

On Fri, 2011-02-25 at 14:21 +0100, Lorenzo Isella wrote:
 Dear All,
 I need to plot some points on the surface of a sphere, but I am not sure 
 about how to proceed to achieve this in R (or if it is suitable for this 
 at all).
 In any case, I am not looking for really fancy visualizations; for 
 instance you can consider the images between formulae 5 and 6 at
 
 http://bit.ly/hOgK9h
 
 Any suggestion is appreciated.
 Cheers
 
 Lorenzo
 


Re: [R] Interpreting the example given by Prof Frank Harrell in {Design} validate.cph

2011-02-25 Thread Frank Harrell

Here's the way I would explore this, and some of the code is made more tidy. 
Note that also you could vectorize your simulation.  I have used set.seed
multiple times to make bootstrap samples the same across runs.  -Frank

. . .
if (data[i, 3] == 4) data[i, 5] <- sample(c(0, 1), 1, prob=c(.06, .94))}

d <- data.frame(tumor=factor(data[,1]), ecog=factor(data[,2]),
                rx=factor(data[,3]), os=data[,4], censor=data[,5])
S <- with(d, Surv(os, censor))

## Check collinearity of rx with other predictors
lrm(rx ~ tumor*ecog, data=d)
## What is the marginal strength of rx (assuming PH)?
cph(S ~ rx, data=d)
## What is partial effect of rx (assuming PH)?
anova(cph(S ~ tumor + ecog + rx, data=d))
## What is combined partial effect of tumor and ecog adjusting for rx?
anova(cph(S ~ tumor + ecog + strat(rx), data=d), tumor, ecog)  ## nothing but noise
## What is their effect not adjusting for rx
cph(S ~ tumor + ecog, data=d)  ## huge

f <- cph(S ~ tumor + ecog, x=TRUE, y=TRUE, surv=TRUE, data=d)
set.seed(1)
validate(f, B=100, dxy=TRUE)
w <- rep(1, 1000)   #  only one stratum, doesn't change model
f <- cph(S ~ tumor + ecog + strat(w), x=TRUE, y=TRUE, surv=TRUE, data=d)
set.seed(1)
validate(f, B=100, dxy=TRUE, u=60)
## identical to last validate except for -Dxy

f <- cph(S ~ tumor + ecog + strat(rx), x=TRUE, y=TRUE, surv=TRUE,
         time.inc=60, data=d)
set.seed(1)
validate(f, B=100)  ## no predictive ability
set.seed(1)
validate(f, B=100, dxy=TRUE, u=60)
## Only Dxy indicates some predictive information; larger in abs. value
## than model ignoring rx (0.3842 vs. 0.3177)




-
Frank Harrell
Department of Biostatistics, Vanderbilt University
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Interpreting-the-example-given-by-Prof-Frank-Harrell-in-Design-validate-cph-tp3316820p3324516.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] lm - log(variable) - skip log(0)

2011-02-25 Thread agent dunham

Apologies, I'm really new with R, Can you help me with the syntax?

here is my data.frame in which I introduce independent variables:

 varind <-
 data.frame(datpos$hdom2,datpos$NumPies,datpos$InHart,datpos$CV,datpos$CA,datpos$FCC)

varind has dimensions (194, 6), in case that's necessary. Then I type:

 loglmp4 <- lm(log(datpos$IncAltuDom)~log(varind), subset=varind>0)

Error in model.frame.default(formula = log(datpos$IncAltuDom) ~ log(varind),  :
  invalid type (list) for variable 'log(varind)'

Thanks again,

-- 
View this message in context: 
http://r.789695.n4.nabble.com/lm-log-variable-skip-log-0-tp3324263p3324341.html
Sent from the R help mailing list archive at Nabble.com.



[R] nls

2011-02-25 Thread Abeer Fadda
Hi,
I would like to find the x value (independent variable) that corresponds to a
given value of the dependent variable, using a model fitted with nls.
With predict() I can find the y values that correspond to a list of x values;
I need the other way around. Can it be done?
Thanks,
afadda



Re: [R] lm - log(variable) - skip log(0)

2011-02-25 Thread agent dunham

Apologies, I'm really new with R, Can you help me with the syntax? 

here is my data.frame in which I introduce independent variables: 

 varind <-
 data.frame(datpos$hdom2,datpos$NumPies,datpos$InHart,datpos$CV,datpos$CA,datpos$FCC)

varind has dimensions (194, 6), in case that's necessary. Then I type:

 loglmp4 <- lm(log(datpos$IncAltuDom)~log(varind), subset=varind>0)

Error in model.frame.default(formula = log(datpos$IncAltuDom) ~ log(varind),  :
  invalid type (list) for variable 'log(varind)'

Thanks again,
-- 
View this message in context: 
http://r.789695.n4.nabble.com/lm-log-variable-skip-log-0-tp3324263p3324344.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Group rows by common ID and plot?

2011-02-25 Thread Mike Marchywka














 Date: Thu, 24 Feb 2011 13:28:18 -0800
 From: dannyb...@gmail.com
 To: r-help@r-project.org
 Subject: Re: [R] Group rows by common ID and plot?


does this do what you want?

 library(lattice)

df <- data.frame(x=1:100, y=1.0/(1:100), f=floor((1:100)/10))

str(df)
'data.frame':   100 obs. of  3 variables:
 $ x: int  1 2 3 4 5 6 7 8 9 10 ...
 $ y: num  1 0.5 0.333 0.25 0.2 ...
 $ f: num  0 0 0 0 0 0 0 0 0 1 ...
 
xyplot(y~x|f,data=df)
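
The same conditioning idea could be applied to the posted layout along these lines. This is a sketch only: the data frame `wide` below holds the six probes and just the first three sample columns from the quoted example, and the names `long`, `probe`, `set`, `sample` and `value` are made up for illustration.

```r
library(lattice)

# Subset of the posted data: 6 probes, first three sample columns only
wide <- data.frame(
  ProbeSet.ID.F  = c("7896741_479302", "7896741_226901", "7896741_2337",
                     "7896741_289201", "7896741_240730", "7896741_776611"),
  ProbeSet.ID    = c(7896741, 7896741, 7896741, 7896742, 7896742, 7896743),
  X0030V120810.4 = c(20, 15, 168, 54, 38, 6),
  X0143V120110.4 = c(14, 73, 126, 64, 158, 27),
  X0258V111710.4 = c(5, 31, 111, 11, 95, 7)
)

sample_cols <- grep("^X", names(wide), value = TRUE)

# Stack to long format: one row per probe per sample
long <- data.frame(
  probe  = rep(wide$ProbeSet.ID.F, times = length(sample_cols)),
  set    = factor(rep(wide$ProbeSet.ID, times = length(sample_cols))),
  sample = factor(rep(sample_cols, each = nrow(wide)), levels = sample_cols),
  value  = unlist(wide[sample_cols], use.names = FALSE)
)

# One panel per ProbeSet.ID, one line per probe within each panel
p <- xyplot(value ~ sample | set, groups = probe, data = long,
            type = "b", layout = c(3, 1),
            scales = list(x = list(rot = 45)))
print(p)
```

With the full data, all the X... columns would be stacked in the same way.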






 In terms of a reproducible example:

 ProbeSet.ID.F ProbeSet.ID Feature.ID Gene.Symbol X0030V120810.4
 X0143V120110.4 X0258V111710.4 X0283V111710.4 X0430V120710.4 X0472V111610.4
 X0520V111610.4 X0546V113010.4 X0578V111810.4 X0624V111810.4
 7896741_479302 7896741 479302 OR4F17 20
 14 5 4 43 85
 12 14 7 5
 7896741_226901 7896741 226901 OR4F17 15
 73 31 14 32 28
 10 42 11 28
 7896741_2337 7896741 2337 OR4F17 168
 126 111 120 119 84
 149 76 347 88
 7896741_289201 7896742 289201 OR4F18 54
 64 11 6 59 66
 10 50 51 27
 7896741_240730 7896742 240730 OR4F18 38
 158 95 38 59 131
 114 100 102 40
 7896741_776611 7896743 776611 OR4F19 6
 27 7 7 16 105
 35 17 19 23


 ...becomes three panels of a plot, containing the lines:

 Plot 1:

 7896741_479302 7896741 479302 OR4F17 20
 14 5 4 43 85
 12 14 7 5
 7896741_226901 7896741 226901 OR4F17 15
 73 31 14 32 28
 10 42 11 28
 7896741_2337 7896741 2337 OR4F17 168
 126 111 120 119 84
 149 76 347 88

 Plot2:

 7896741_289201 7896742 289201 OR4F18 54
 64 11 6 59 66
 10 50 51 27
 7896741_240730 7896742 240730 OR4F18 38
 158 95 38 59 131
 114 100 102 40

 Plot 3:
 7896741_776611 7896743 776611 OR4F19 6
 27 7 7 16 105
 35 17 19 23

 and so on...

 Any ideas much appreciated.
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Group-rows-by-common-ID-and-plot-tp3321955p3323465.html
 Sent from the R help mailing list archive at Nabble.com.



Re: [R] group by in data.frame

2011-02-25 Thread zem

Hi Ivan,

thanks for your reply!
but the problem is that the dataframe has 2 rows and  ca. 2000
groups, and I don't have a column with the group names, because the groups
depend on 2 other columns ... 
any other idea? or maybe I didn't understand what you posted ... :( 
-- 
View this message in context: 
http://r.789695.n4.nabble.com/group-by-in-data-frame-tp3324240p3324327.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Group rows by common ID and plot?

2011-02-25 Thread Scott Chamberlain
I imagine you want the ggplot2 package. 

something like:

ggplot(dataframe, aes(x = yourxvar, y = youryvar)) +
geom_point() +
facet_wrap(~ ProbeSet.ID) 

Or facet_grid(), either of which makes a different panel for each unique level 
of ProbeSet.ID

see ggplot2 help here: http://had.co.nz/ggplot2/
On Thursday, February 24, 2011 at 3:28 PM, DB1984 wrote:

 In terms of a reproducible example:
 
  ProbeSet.ID.F ProbeSet.ID Feature.ID Gene.Symbol X0030V120810.4
 X0143V120110.4 X0258V111710.4 X0283V111710.4 X0430V120710.4 X0472V111610.4
 X0520V111610.4 X0546V113010.4 X0578V111810.4 X0624V111810.4
 7896741_479302 7896741 479302 OR4F17 20 
 14 5 4 43 85 
 12 14 7 5
 7896741_226901 7896741 226901 OR4F17 15 
 73 31 14 32 28 
 10 42 11 28
 7896741_2337 7896741 2337 OR4F17 168 
 126 111 120 119 84 
 149 76 347 88
 7896741_289201 7896742 289201 OR4F18 54 
 64 11 6 59 66 
 10 50 51 27
 7896741_240730 7896742 240730 OR4F18 38 
 158 95 38 59 131 
 114 100 102 40
 7896741_776611 7896743 776611 OR4F19 6 
 27 7 7 16 105 
 35 17 19 23
 
 
 ...becomes three panels of a plot, containing the lines:
 
 Plot 1:
 
 7896741_479302 7896741 479302 OR4F17 20 
 14 5 4 43 85 
 12 14 7 5
 7896741_226901 7896741 226901 OR4F17 15 
 73 31 14 32 28 
 10 42 11 28
 7896741_2337 7896741 2337 OR4F17 168 
 126 111 120 119 84 
 149 76 347 88
 
 Plot2: 
 
 7896741_289201 7896742 289201 OR4F18 54 
 64 11 6 59 66 
 10 50 51 27
 7896741_240730 7896742 240730 OR4F18 38 
 158 95 38 59 131 
 114 100 102 40
 
 Plot 3:
 7896741_776611 7896743 776611 OR4F19 6 
 27 7 7 16 105 
 35 17 19 23
 
 and so on...
 
 Any ideas much appreciated. 
 -- 
 View this message in context: 
 http://r.789695.n4.nabble.com/Group-rows-by-common-ID-and-plot-tp3321955p3323465.html
 Sent from the R help mailing list archive at Nabble.com.
 


Re: [R] group by in data.frame

2011-02-25 Thread zem

Thanks, I solved it ... my problem was that I had 2 columns by which I have to
group; I just pasted the values together so that at the end I have one
column to group by, and then it was easy ... 
here is the script that I used:
http://tolstoy.newcastle.edu.au/R/help/06/07/30184.html
Ivan, thanks for the help too :) 
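
For the archive, the pasting step could look roughly like this. It is a sketch with made-up data: the data frame `df` and the columns `a`, `b` and `val` are hypothetical, not my real data.

```r
# Hypothetical data: groups are defined by the combination of columns a and b
df <- data.frame(a   = c(1, 1, 2, 2, 1),
                 b   = c("x", "x", "y", "z", "x"),
                 val = c(10, 20, 30, 40, 50))

# Paste the two columns together to get a single grouping column
df$key <- paste(df$a, df$b, sep = "_")

# Summarise by the combined key
sums <- tapply(df$val, df$key, sum)
print(sums)

# Alternative without building a key: group by both columns directly
agg <- aggregate(val ~ a + b, data = df, FUN = sum)
print(agg)
```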
-- 
View this message in context: 
http://r.789695.n4.nabble.com/group-by-in-data-frame-tp3324240p3324469.html
Sent from the R help mailing list archive at Nabble.com.



[R] Error

2011-02-25 Thread mathijsdevaan

Hi, I am running the following script for a different (much larger data
frame):

DF = data.frame(read.table(textConnection("A  B  C  D  E
1 1  a  1999  1  0
2 1  b  1999  0  1
3 1  c  1999  0  1
4 1  d  1999  1  0
5 2  c  2001  1  0
6 2  d  2001  0  1
7 3  a  2004  0  1
8 3  b  2004  0  1
9 3  d  2004  0  1
10 4  b  2001  1  0
11 4  c  2001  1  0
12 4  d  2001  0  1"),head=TRUE,stringsAsFactors=FALSE))
DF <- DF[order(DF$B,DF$C),]  #order by developer_id and year
f <- function(x)
{
unlist(lapply(x, FUN = function(z) cumsum(z) - z))
}
DF <- cbind(DF[,c(1:3)],ave(DF[, c(4:5)],DF$B, FUN = f))

I get the following error:

Error in `[<-.data.frame`(`*tmp*`, i, , value = integer(0)) :
  replacement has 0 items, need 37597770
In addition: Warning message:
In max(i) : no non-missing arguments to max; returning -Inf

The dimensions of the data frame are (5,108), so the last line of the
script becomes:

DF <- cbind(DF[,c(1:3)],ave(DF[, c(4:108)],DF$B, FUN = f))

Any idea how to solve this problem? Thanks!
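
One way to sidestep handing a whole data frame to ave() might be to call ave() on one column at a time. The following is only a sketch on toy data (the column names B, D and E mirror the example above; lag_cumsum is a hypothetical helper name), and whether it avoids the reported error on the full data set has not been verified:

```r
# Toy data: B is the grouping column, D and E are the columns to accumulate.
DF <- data.frame(B = c("a", "a", "b", "b", "b"),
                 D = c(1, 0, 1, 1, 0),
                 E = c(0, 1, 0, 1, 1))

# Running total of the earlier rows within a group (current row excluded).
lag_cumsum <- function(z) cumsum(z) - z

# Apply ave() to one numeric vector at a time instead of to a data frame.
cols <- c("D", "E")
DF[cols] <- lapply(DF[cols], function(col) ave(col, DF$B, FUN = lag_cumsum))
print(DF)
```

Each column is processed independently, so the same pattern would extend to columns 4:108 by changing cols.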
 

-- 
View this message in context: 
http://r.789695.n4.nabble.com/Error-tp3324531p3324531.html
Sent from the R help mailing list archive at Nabble.com.



[R] kohonen: Argument data should be numeric

2011-02-25 Thread Jay
Hi,

I'm trying to use the kohonen package to build SOMs. However,
when I try this on my data I get the error:

Argument data should be numeric

when running som(data.train, grid = somgrid(6, 6, "hexagonal")).
As you can see, there is a problem with the data type of
data.train, which is a list. When I try to convert it to numeric I
get the error:

(list) object cannot be coerced to type 'double'

What should I do? I can convert the data.train if I take only one
column of the list: data.train[[1]], but that is naturally not what I
want. How did I end up with this data format?

What I did:
data1 <- read.csv("data1.txt", sep = ";")
training <- sample(nrow(data1), 1000)
data.train <- data1[training,2:20]

I tried to use scan as the import method (read about this somewhere)
and unlist, but I'm not really sure how I should get it to numeric/
working.
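
For what it is worth, read.csv() returns a data frame, and a data frame is a list of columns, which would explain both messages. A minimal sketch of the usual conversion follows, on made-up data (the columns id, a and b are hypothetical stand-ins for the real file) and assuming all selected columns are numeric:

```r
# Stand-in for the result of read.csv(); a data frame is a list of columns.
data1 <- data.frame(id = 1:5,
                    a  = c(1.5, 2, 3.5, 4, 5),
                    b  = c(10L, 20L, 30L, 40L, 50L))
data.train <- data1[1:3, 2:3]

is.list(data.train)              # a data frame is a list
sapply(data.train, is.numeric)   # check that every column really is numeric

# Convert the data frame to a numeric matrix before passing it on.
train.mat <- as.matrix(data.train)
str(train.mat)
```

If sapply() reports a non-numeric column (for example a factor created from text), as.matrix() would give a character matrix instead, so that column would need to be dropped or recoded first.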



Thanks,
Jay



Re: [R] Error

2011-02-25 Thread Mathijs de Vaan
Sorry for being unclear: the example works fine on my machine too. However,
with the much larger dataset (dim(5,108)) I get the reported error.

Mathijs

On Fri, Feb 25, 2011 at 3:56 PM, Scott Chamberlain 
scttchamberla...@gmail.com wrote:

  Works fine on my machine:
  DF
A BC D E
 1  1 a 1999 0 0
 2  1 b 1999 0 0
 3  1 c 1999 0 0
 4  1 d 1999 0 0
 5  2 c 2001 0 1
 6  2 d 2001 1 0
 7  3 a 2004 1 0
 8  3 b 2004 0 1
 9  3 d 2004 1 1
 10 4 b 2001 0 2
 11 4 c 2001 1 1
 12 4 d 2001 1 2


 here's my session info:

  sessionInfo()
 R version 2.12.1 (2010-12-16)
 Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

 locale:
 [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

 attached base packages:
 [1] stats graphics  grDevices utils datasets  methods   base

 other attached packages:
 [1] phyloch_1.4.48   XML_3.2-0colorspace_1.0-1 phangorn_1.3-1
 ape_2.6-2
 [6] quadprog_1.5-3   plyr_1.4

 loaded via a namespace (and not attached):
 [1] gee_4.13-16 grid_2.12.1 lattice_0.19-17 nlme_3.1-97
 tools_2.12.1







Re: [R] Error

2011-02-25 Thread Ivan Calandra

Hi,

I don't get any error...
DF <- cbind(DF[,c(1:3)],ave(DF[, c(4:5)],DF$B, FUN = f))
DF
   A B    C D E
1  1 a 1999 0 0
7  3 a 2004 1 0
2  1 b 1999 0 0
10 4 b 2001 0 1
8  3 b 2004 1 1
3  1 c 1999 0 0
5  2 c 2001 0 1
11 4 c 2001 1 1
4  1 d 1999 0 0
6  2 d 2001 1 0
12 4 d 2001 1 1
9  3 d 2004 1 2

Ivan






--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de

**
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php



[R] Fitting distribution in range

2011-02-25 Thread saray

Hello.

I am trying to fit my data sample x with different distributions such that
the integral of the fitted density from min(x) to max(x) is one.
I have therefore written my own log-likelihood functions, which I pass to
mle {stats4}. For example:

ll_gamma <- function(a,b) {
integrand <- function(y){dgamma(y, shape=a, rate=b)}
integ_res <- tryCatch({integrate(integrand,min_x,max_x)$value},
error=function(err){0});
if (integ_res == 0) { return(NA) }
C = 1 / integ_res
res = -(sum(log(C*dgamma(x,shape=a,rate=b))))
return(res)
}

m <- mean(x)
v <- var(x)
fit <- mle(minuslog=ll_gamma,start=list(a=m^2/v,b=m/v))

Now, for some reason I get very strange results. I tested this by sampling
random numbers from a gamma distribution, for example, and then trying to fit
them with the function I wrote.

Am I doing something wrong? Do I need to first fit the sample with a regular
gamma distribution and then calculate the normalisation factor C? (I think
not, i.e. I think that the normalisation factor should be included in the
log-likelihood function.) Please note that I don't know which distribution
fits my data best, so I am trying to fit several distributions and choose
the best one using AIC.

Any comments would be much appreciated.
Please let me know if any of you have ever run into a similar problem.
Thank you in advance,
Saray







-- 
View this message in context: 
http://r.789695.n4.nabble.com/Fitting-distribution-in-range-tp3324579p3324579.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Error

2011-02-25 Thread Scott Chamberlain
Works fine on my machine:
> DF
   A B    C D E
1  1 a 1999 0 0
2  1 b 1999 0 0
3  1 c 1999 0 0
4  1 d 1999 0 0
5  2 c 2001 0 1
6  2 d 2001 1 0
7  3 a 2004 1 0
8  3 b 2004 0 1
9  3 d 2004 1 1
10 4 b 2001 0 2
11 4 c 2001 1 1
12 4 d 2001 1 2



here's my session info:

 sessionInfo()
R version 2.12.1 (2010-12-16)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats  graphics grDevices utils  datasets methods  base 

other attached packages:
[1] phyloch_1.4.48  XML_3.2-0 colorspace_1.0-1 phangorn_1.3-1  ape_2.6-2 
[6] quadprog_1.5-3  plyr_1.4 

loaded via a namespace (and not attached):
[1] gee_4.13-16  grid_2.12.1  lattice_0.19-17 nlme_3.1-97  tools_2.12.1 



[R] linear model lme4

2011-02-25 Thread Brian Smith
Hi,


I wanted to check the difference in results (using lme4) , if I treated a
particular variable (beadchip) as a random effect vs if I treated it as a
fixed effect.


For the first case, my formula is:


lmer.result <- lmer(expression ~ cancerClass + (1|beadchip))


For the second case, I want to do:


lmer.result2 <- lmer(expression ~ cancerClass + beadchip)



However, I get an error in the second case:


 Error in lmerFactorList(formula, fr, 0L, 0L):

  No random effects terms specified in formula



Is there any way that I can get lmer() to accept a formula without a random
effect?


many thanks



Re: [R] group by in data.frame

2011-02-25 Thread zem

Yeah, you are right.
I wanted to post a short example of what I want to do, and in the meantime I
solved the problem.
But here it is:
I have something like this data frame:
c1 <- c(1,2,3,2,2,3,1,2,2,2)
c2 <- c(5,6,7,7,5,7,5,7,6,6)
c3 <- rnorm(10)
x <- cbind(c1,c2,c3)
> x
      c1 c2          c3
 [1,]  1  5  0.08279036
 [2,]  2  6  0.59135988
 [3,]  3  7  1.45520468
 [4,]  2  7 -1.70094640
 [5,]  2  5  0.13065228
 [6,]  3  7 -1.12080980
 [7,]  1  5  0.42779354
 [8,]  2  7 -1.53111972
 [9,]  2  6  0.29299987
[10,]  2  6 -0.01602095

# with aggregate I get this:
> aggregate(x[,3],list(x[,1],x[,2]),mean)
  Group.1 Group.2          x
1       1       5  0.2552920
2       2       5  0.1306523
3       2       6  0.2894463
4       2       7 -1.6160331
5       3       7  0.1671974


The problem was that I was grouping by two columns, so I couldn't copy the
result back to x.

The solution was to make another column with paste(x[,1],x[,2],sep="_")
and then use the solution from this link:
http://tolstoy.newcastle.edu.au/R/help/06/07/30184.html
So I solved my problem.

Ivan, many thanks for your support and quick responses! :)

-- 
View this message in context: 
http://r.789695.n4.nabble.com/group-by-in-data-frame-tp3324240p3324608.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] aggregate with cumsum

2011-02-25 Thread stephenb

Bill,

What would be the fastest way to output not just single lines but small data
frames of about 60 rows?

I prefer writing to a text file because the final output is large (47k times
60 rows). Since I do not know its size in advance, I would have to use rbind
to build the object, which creates the memory problems described here:

http://www.matthewckeller.com/html/memory.html

Look at the "swiss cheese" paragraph.
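
One possible pattern, sketched here with made-up 60-row chunks (the columns id, row and value are hypothetical, and a temporary file stands in for the real output path), is to append each small data frame to the text file with write.table() as soon as it is produced, so that nothing has to be grown with rbind:

```r
out_file <- tempfile(fileext = ".txt")

for (i in 1:3) {
  # Stand-in for one small result of about 60 rows.
  chunk <- data.frame(id = i, row = 1:60, value = i * (1:60))

  # Write the header only once, then append the later chunks below it.
  write.table(chunk, out_file,
              append    = i > 1,
              col.names = i == 1,
              row.names = FALSE, sep = "\t", quote = FALSE)
}

res <- read.delim(out_file)   # read back only to inspect the result
dim(res)
```

Whether this is the fastest option has not been measured here; it simply keeps memory use bounded by the size of one chunk.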

kind regards
Stephen
-- 
View this message in context: 
http://r.789695.n4.nabble.com/aggregate-with-cumsum-tp2992383p3324610.html
Sent from the R help mailing list archive at Nabble.com.



[R] using scatter plot for different values

2011-02-25 Thread amir
Hi everyone,

I have two different sets of results, each given as (x,y) pairs:
> x1 <- c(1,5,8)
> y1 <- c(8,9,10)
> x2 <- c(1,7,9)
> y2 <- c(5,7,9)

Let's call them One=(x1,y1) and Two=(x2,y2).

How can I draw them in R in a single scatter plot of (x,y), with two
different legend entries (One, Two)?

Regards,
Amir



[R] neural networks with RSNNS

2011-02-25 Thread Sara Szeremeta
Hello All!

I am trying to train a NN with the function train() after splitting the data
with the function splitForTrainingAndTest(). The split is OK (I checked it),
but when I try the training I get this message:

Error in UseMethod("train") :
  no applicable method for 'train' applied to an object of class
"c('double', 'numeric')"

The input data are logarithms of some financial values and their first lags.


Can anybody give me a hint on how to make the train() function work
correctly?



Thank you and have a good day!

Sara



Re: [R] lm - log(variable) - skip log(0)

2011-02-25 Thread David Winsemius


On Feb 25, 2011, at 7:25 AM, agent dunham wrote:



Apologies, I'm really new with R, Can you help me with the syntax?

here is my data.frame in which I introduce independent variables:


varind <-
data.frame(datpos$hdom2,datpos$NumPies,datpos$InHart,datpos$CV,datpos$CA,datpos$FCC)


varind has dimensions(194, 6), in case that's necessary. Then I type:


loglmp4 <- lm(log(datpos$IncAltuDom)~log(varind), subset=varind>0)


Because varind is now a dataframe, you need to refer to its columns
when offering candidate independent variables to lm. It is not clear
which column you wanted to test for positivity and which to use on the
RHS from varind. You should also get in the habit of:


--- including context in followup questions
--- using the data= argument in model construction

Going back to your original question, where the dataframe was named
dat and it was clear what variable you wanted on the RHS:


lmod1.lm <- lm( log(inaltu)~log(indiam), data= dat, subset=(indiam > 0 & inaltu > 0)  )
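
To illustrate the subset= form above, here is a small sketch on simulated data. The data frame dat and its values are made up (only the column names indiam and inaltu come from the original question), with one zero planted in each column:

```r
set.seed(1)

# Ten rows; row 1 has indiam == 0 and row 10 has inaltu == 0.
dat <- data.frame(indiam = c(0, runif(9, 1, 10)),
                  inaltu = c(runif(9, 1, 10), 0))

# Rows failing the condition are dropped before the model is fitted,
# so log(0) never reaches the regression.
fit <- lm(log(inaltu) ~ log(indiam), data = dat,
          subset = indiam > 0 & inaltu > 0)

nobs(fit)   # number of rows actually used in the fit
coef(fit)
```

Because the formula refers to columns of dat by name, the same subset expression can test any column of the data frame, not only the ones used in the model.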


--
David.




Error in model.frame.default(formula = log(datpos$IncAltuDom) ~ log(varind), :
  invalid type (list) for variable 'log(varind)'

Thanks again, u...@host.com
--
View this message in context: 
http://r.789695.n4.nabble.com/lm-log-variable-skip-log-0-tp3324263p3324344.html
Sent from the R help mailing list archive at Nabble.com.



David Winsemius, MD
West Hartford, CT



Re: [R] group by in data.frame

2011-02-25 Thread Ivan Calandra
Ok, now I think I've understood, but I'm not sure, since I think that my
ave() solution does work. I thought you had several numerical
variables and 1 factor; it is the opposite, but it is still possible:


c3_mean <- ave(x[,3], list(x[,1],x[,2]), FUN=mean)  # note that the values
# are different because of rnorm()

cbind(x[,1:2], c3_mean)

Is this what you want?
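
A reproducible variant of the same idea, as a sketch: c3 is replaced by fixed values instead of rnorm() so that the group means can be checked by hand, and the two grouping columns are passed to ave() as separate arguments.

```r
c1 <- c(1, 2, 3, 2, 2, 3, 1, 2, 2, 2)
c2 <- c(5, 6, 7, 7, 5, 7, 5, 7, 6, 6)
c3 <- c(10, 20, 30, 40, 50, 60, 70, 80, 90, 100)  # fixed values, not rnorm()
x  <- cbind(c1, c2, c3)

# ave() returns one value per row: the mean of c3 within each (c1, c2) group.
c3_mean <- ave(x[, "c3"], x[, "c1"], x[, "c2"], FUN = mean)

result <- cbind(x, c3_mean)
print(result)
```

Unlike aggregate(), which returns one row per group, ave() keeps the original row order and length, so the result can be bound straight back onto x.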

Ivan





--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de

**
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php



Re: [R] group by in data.frame

2011-02-25 Thread David Winsemius


On Feb 25, 2011, at 10:14 AM, zem wrote:



Yeah, you are right
i want to post an short example what i want to do .. and in the  
meantime i

solved the problem ...
but here is:
i have something like this dataframe:
c1 <- c(1,2,3,2,2,3,1,2,2,2)
c2 <- c(5,6,7,7,5,7,5,7,6,6)
c3 <- rnorm(10)
x <- cbind(c1,c2,c3)

x

 c1 c2  c3
[1,]  1  5  0.08279036
[2,]  2  6  0.59135988
[3,]  3  7  1.45520468
[4,]  2  7 -1.70094640
[5,]  2  5  0.13065228
[6,]  3  7 -1.12080980
[7,]  1  5  0.42779354
[8,]  2  7 -1.53111972
[9,]  2  6  0.29299987
[10,]  2  6 -0.01602095

#whith aggregate i receive this:

aggregate(x[,3],list(x[,1],x[,2]),mean)

 Group.1 Group.2  x
1   1   5  0.2552920
2   2   5  0.1306523
3   2   6  0.2894463
4   2   7 -1.6160331
5   3   7  0.1671974


and the problem was that i was grouping by 2 columns, so i couldn't  
copy the

result to x.

the solution was i made another column with paste(x[,1],x[,2],sep="_")
and then i used the solution from this link:
http://tolstoy.newcastle.edu.au/R/help/06/07/30184.html
so i solved my problem


Right. That works and has the virtue that it is reasonably clear what
is going on. Another approach, possibly even clearer and even more
R-ish, is to use the interaction() function.


> aggregate(x[,3], list(interaction(x[,1],x[,2]) ), mean)
  Group.1            x
1     1.5 -0.658932424
2     2.5  0.824756795
3     2.6  0.640471421
4     2.7 -0.008519716
5     3.7 -0.053233855




Ivan, many thanks for your support and quik responses! :)

--
View this message in context: 
http://r.789695.n4.nabble.com/group-by-in-data-frame-tp3324240p3324608.html
Sent from the R help mailing list archive at Nabble.com.



David Winsemius, MD
West Hartford, CT



Re: [R] using scatter plot for different values

2011-02-25 Thread David Winsemius


On Feb 25, 2011, at 10:26 AM, amir wrote:


Hi everyone,

I have two different results which is determined by (x,y)

x1 <- c(1,5,8)
y1 <- c(8,9,10)
x2 <- c(1,7,9)
y2 <- c(5,7,9)


Let call one=(x1,y1) and Two=(x2,y2)

how can I draw them in R in a scatter plot showing (x,y) with two
different legends (One, Two)


? points
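
As a rough sketch of what the help page leads to, using the vectors from the question (the colours, plotting symbols and legend position are arbitrary choices, not part of the original posts):

```r
x1 <- c(1, 5, 8);  y1 <- c(8, 9, 10)   # series "One"
x2 <- c(1, 7, 9);  y2 <- c(5, 7, 9)    # series "Two"

# Axis limits must cover both series, otherwise points of "Two" may be clipped.
xr <- range(x1, x2)
yr <- range(y1, y2)

if (!interactive()) pdf(NULL)  # when run as a script, draw to a null device

plot(x1, y1, xlim = xr, ylim = yr, pch = 16, col = "blue",
     xlab = "x", ylab = "y")
points(x2, y2, pch = 17, col = "red")   # add the second series to the same plot
legend("topleft", legend = c("One", "Two"),
       pch = c(16, 17), col = c("blue", "red"))
```

plot() sets up the axes with the first series, points() adds the second one, and legend() maps each symbol and colour to its label.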



Regards,
Amir



David Winsemius, MD
West Hartford, CT



[R] BFGS versus L-BFGS-B

2011-02-25 Thread Prof. John C Nash
There are considerable differences between the algorithms. And BFGS is an
unfortunate nomenclature, since there are so many variants that are VERY
different. It was called "variable metric" in my book, from which the code was
derived, and that code was from Roger Fletcher's Fortran VM code based on his
1970 paper. L-BFGS-B is a later and more complicated algorithm with some
pretty nice properties. The code is much larger.

Re: less memory -- this will depend on the number of parameters, but to my
knowledge there are no good benchmark studies of memory and performance.
Perhaps someone wants to propose one for Google Summer of Code (see
http://rwiki.sciviews.org/doku.php?id=developers:projects:gsoc2011
).

The optimx package can call Rvmmin, which has box constraints (also Rcgmin,
which is intended for very low memory). It also offers several other methods
with box constraints, including L-BFGS-B. Worth a try if you are seeking a
method for multiple production runs. Unfortunately, we seem to have some CRAN
check errors on Solaris and some old releases -- platforms I do not have -- so
it may be a few days or more until we sort out the issues, which seem to be
related to alignment of the underlying packages for which optimx is a wrapper.

Use of transformation can be very effective. But again, I don't think there
are good studies on whether use of box constraints or transformations is
better, and when. That is another project, on which I have made some tentative
beginnings. Collaborations welcome.

Best,

JN


On 02/25/2011 06:00 AM, r-help-requ...@r-project.org wrote:
 Message: 86
 Date: Fri, 25 Feb 2011 00:11:59 -0500
 From: Brian Tsai btsa...@gmail.com
 To: r-help@r-project.org
 Subject: [R] BFGS versus L-BFGS-B
 Message-ID:
   aanlktimszvkjbuhv-bbr1easpx9ootjxqcujgujr5...@mail.gmail.com
 Content-Type: text/plain
 
 Hi all,
 
 I'm trying to figure out what the effective differences between BFGS and
 L-BFGS-B are, besides the obvious ones that L-BFGS-B should be using a lot
 less memory, and that the user can provide box constraints.
 
 1) Why would you ever want to use BFGS, if L-BFGS-B does the same thing but
 uses less memory?
 
 2) If i'm optimizing with respect to a variable x that must be non-negative,
 a common approach is to do a change of variables x = exp(y), and optimize
 unconstrained with respect to y.  Is optimization using box constraints on
 x, likely to produce as good a result as unconstrained optimization on y?
 
 - Brian.
 
   [[alternative HTML version deleted]]
 
 




Re: [R] linear model lme4

2011-02-25 Thread Doran, Harold
No. As the error states, lmer() requires random effects. lm() does not, and
a model with no random effects is exactly what lm() fits. However, some
caution is warranted when comparing the two fits.
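
A sketch of the fixed-effect version with lm(), on simulated data: the variable names follow the original question, but the data frame d, the group sizes and the values are invented for illustration.

```r
set.seed(42)

# Simulated stand-in data: 2 cancer classes, 4 beadchips, 24 samples.
d <- data.frame(cancerClass = factor(rep(c("A", "B"), each = 12)),
                beadchip    = factor(rep(1:4, times = 6)))
d$expression <- rnorm(24) + as.numeric(d$cancerClass == "B")

# beadchip enters as an ordinary fixed-effect factor, so lm() is sufficient.
fit_fixed <- lm(expression ~ cancerClass + beadchip, data = d)
coef(fit_fixed)
```

With beadchip as a fixed factor, each chip gets its own coefficient relative to the first one, whereas the random-effect model estimates a single variance for the chip effects; this is one reason the two sets of results are not directly comparable.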




[R] Error: address 0x6951c20, cause 'memory not mapped'

2011-02-25 Thread Jannis
Dear R list,


I get a strange error in R:

 *** caught segfault ***
address 0x6951c20, cause 'memory not mapped'

Traceback:
 1: .C(spline_eval, z$method, nu = as.integer(n), x = as.double(xout), y 
= double(n), z$n, z$x, z$y, z$b, z$c, z$d, PACKAGE = stats)
 2: spline(gam.data$x[, col.data], gam.smooths.all$fit[, m], xout = 
gam.results.global[m, , x.values], ties = mean)
 3: eval.with.vis(expr, envir, enclos)
 4: eval.with.vis(ei, envir)
 5: source(file.path(getwd(), Skripte, r, GAM_hourly, 
1_calcs_GAM_all_sites_hourly.R), echo = TRUE, max.deparse.length = 2e+05)

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace



It seems that the error occurs when I try to perform a spline interpolation
of a smooth function. Can anybody give me some hints on where to dig for a
solution?

Thanks a lot
Jannis


My R version (if this has anything to do with it):

> sessionInfo()
R version 2.10.1 (2009-12-14)
x86_64-unknown-linux-gnu

locale:
 [1] LC_CTYPE=de_DE.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=de_DE.UTF-8        LC_COLLATE=de_DE.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=de_DE.UTF-8
 [7] LC_PAPER=de_DE.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] mgcv_1.6-2

loaded via a namespace (and not attached):
[1] grid_2.10.1        lattice_0.18-3     Matrix_0.999375-43 nlme_3.1-96
[5] tools_2.10.1








Re: [R] group by in data.frame

2011-02-25 Thread Dennis Murphy
Hi:

Here's another way:

c1 <- c(1,2,3,2,2,3,1,2,2,2)
c2 <- c(5,6,7,7,5,7,5,7,6,6)
c3 <- rnorm(10)
x <- data.frame(c1 = factor(c1), c2 = factor(c2), c3)
x <- transform(x, mean = ave(c3, c1, c2, FUN = mean))

Yet another with function ddply() in package plyr:
ddply(x, .(c1, c2), transform, mean = mean(c3))

HTH,
Dennis






Re: [R] limma function problem

2011-02-25 Thread Martin Morgan
On 02/25/2011 04:26 AM, Sukhbir Rattan wrote:
 Hi,
 
 I have two data sets of normalized Affymetrix CEL files, wild type vs. control
 type (each set has three replicates).
 
 
 wild.fish
 AffyBatch object
 size of arrays=712x712 features (10 kb)
 cdf=Zebrafish (15617 affyids)
 number of samples=3
 number of genes=15617
 annotation=zebrafish
 notes=
 Dicer.fish
 AffyBatch object
 size of arrays=712x712 features (10 kb)
 cdf=Zebrafish (15617 affyids)
 number of samples=3
 number of genes=15617
 annotation=zebrafish
 notes=
 
 Now, I have to combine these two S4 objects and use lmFit function of Limma
 package.I am able to combine the two S4 objects using merge function.
 
 
 merge.fish <- merge(wild.fish, Dicer.fish)
 merge.fish
 AffyBatch object
 size of arrays=712x712 features (17833 kb)
 cdf=Zebrafish (15617 affyids)
 number of samples=6
 number of genes=15617
 annotation=zebrafish
 notes=Merge from two AffyBatches with notes: 1)  , and 2)
 
 design
              Wild Mz_Dicer
 GSM95623.CEL    1        0
 GSM95624.CEL    1        0
 GSM95625.CEL    1        0
 GSM95617.CEL    0        1
 GSM95618.CEL    0        1
 GSM95619.CEL    0        1
 
 
 fit <- lmFit(merge.fish, design)
 Error in as.vector(data) :
   no method for coercing this S4 class to a vector
 
 mode(merge.fish)
 [1] "S4"
 
 
 So, how to troubleshoot this problem?

Hi Sukhbir -- this is a Bioconductor package, so please ask on that list.

http://bioconductor.org/help/mailing-list/

However, you'll want to review basic microarray analysis work flows in
R, either on the Bioconductor web site

http://bioconductor.org/help/workflows/oligo-arrays/

or other resources, such as the vignette that comes with affy or limma.
What you have is 'raw' data; you want to 'pre-process' it, e.g., by the
RMA algorithm, prior to assessing differential expression. A more
typical work flow might go directly from your 6 CEL files to an
'ExpressionSet' object using RMA normalization, via the single function
call just.rma from the affy package; no need to ReadAffy and merge.

Hope that helps.

Martin

 
 
 Regards,
 Sukhbir Singh Rattan.
 


-- 
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793



Re: [R] nls

2011-02-25 Thread Bert Gunter
On Fri, Feb 25, 2011 at 6:09 AM, Abeer Fadda
a.fa...@zmbh.uni-heidelberg.de wrote:
 Hi,
 I would like to find the x value (independent variable) for a certain
 dependent value using a model fitted with nls.
 With predict() I can find the y that corresponds to a list of x values. I need the other
 way around. Can it be done?
 Thanks,
 afadda

Yes.

-- Bert




... Oh, if you mean HOW can it be done? -- lots of ways. Analytically,
if y=f(x) is your fitted model, just backsolve x = g(y).

If your algebra isn't up to the task, then just predict y on a
suitably fine x grid, and (assuming monotonicity) find the x
corresponding to the predicted y closest to the y you wish to back
calibrate. This is just a matter of indexing. See ?which.
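
The grid approach described above might look like the following minimal sketch. The data, the exponential model and the target value y0 are made up for illustration; uniroot() is shown as an alternative numerical route to the same inverse, which also assumes the fitted curve is monotone on the search interval.

```r
# Hypothetical data and model, for illustration only
set.seed(1)
dat <- data.frame(x = seq(1, 10, length.out = 50))
dat$y <- 5 * exp(-0.3 * dat$x) + rnorm(50, sd = 0.02)
fit <- nls(y ~ a * exp(-b * x), data = dat, start = list(a = 4, b = 0.2))

y0 <- 1.5  # the y value to back-calibrate

# Grid approach: predict on a fine x grid and pick the closest prediction
xgrid <- seq(min(dat$x), max(dat$x), length.out = 10001)
ygrid <- predict(fit, newdata = data.frame(x = xgrid))
x_grid <- xgrid[which.min(abs(ygrid - y0))]

# Root-finding alternative: solve predict(x) - y0 = 0 on the same interval
f <- function(x) predict(fit, newdata = data.frame(x = x)) - y0
x_root <- uniroot(f, interval = range(dat$x))$root

print(c(grid = x_grid, root = x_root))
```

The grid answer is only as precise as the grid spacing; the root-finding answer is limited by the tolerance of uniroot().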









-- 
Bert Gunter
Genentech Nonclinical Biostatistics



Re: [R] Variable names AS variable names?

2011-02-25 Thread Noah Silverman
My actual code does several things with adaptive filtering.  This will
require accessing data sporadically.  The loop was just a quick example
for the e-mail.

One application is to work with online (streaming) data.  If I get a new
data point in for code "a1", I'll need to be able to reference the
matrix named a1.

On 2/25/11 12:23 AM, David Winsemius wrote:

 On Feb 25, 2011, at 1:55 AM, Noah Silverman wrote:

 How can I dynamically use a variable as the name for another variable?

 I realize this sounds cryptic, so an example is best:

 #Start with an array of codes
 codes <- c("a1", "b24", "q99")

 Is there some reason not to use list(a1, b24, q99)? If not then:

 lapply(codes, somefun)



 #Each code has a corresponding matrix (could be vector)
 a1 <- matrix(rnorm(100), nrow=10)
 b24 <- matrix(rnorm(100), nrow=10)
 q99 <- matrix(rnorm(100), nrow=10)

 #Now, I want to loop through all the codes and do something with each matrix
 for(code in codes){
    #here is where I'm stuck.  I don't want the value of code, but the
    #variable whose name is the value of code

 }


 Any suggestions?

 -N


 David Winsemius, MD
 West Hartford, CT




[R] Accessing sub diagonals / spdiag in R ?

2011-02-25 Thread Mingo
Hello, I'm trying to access a specific number of subdiagonals in a
matrix and am accustomed to using spdiags in MATLAB or Octave. I've
pieced together a solution using for loops; it works, but it isn't
vectorized and is liable to run very slowly for large matrices.

As an example:

A =
1 2 3 4 5
9 8 7 6 5
4 5 6 7 8
5 4 3 2 1
8 7 6 0 1

The subdiagonals are: 9,5,3,0   4,4,6   5,7  and 8.
I know about lower.tri and can fetch the data in a resulting vector,
which in this case would be:

9,4,5,8,5,4,7,3,6,0

though I would have to manipulate this some more to extract the other
diagonals (imagine this being done for say a 1000 x 1000 matrix). I looked
at CRAN and didn't see anything corresponding to
spdiags. The closest package appeared to be the one relating to sparse
matrices and band symmetry. Would you have any suggestions about 1) how to
emulate spdiags or 2) working with the lower.tri returned-data and
extracting the remaining diags efficiently. I can live with what I have but
imagine that there is something more direct. Thanks



[R] Forced inclusion of varaibles in validate command as well as step

2011-02-25 Thread Jon Kroll Bjerregaard
Hello all,

I am a very new R user; I am used to using Stata.

My problem:

I want to build a Cox model and validate it.

I have a large number of clinically relevant factors and feel the need to
reduce these. Meanwhile, I have some clinical variables I deem sufficiently
important to force into the model regardless of AIC or p-value.

This is my present log of commands:

library(rms)
library(survival)
library(Hmisc)

data1 <- read.table("optimism.csv", header=TRUE, sep=",")
attach(data1)

coxmodel4 <- coxph(formula=Surv(OS,mors) ~
iAJCC2+iAJCC3+iPS2+iPS3+alder_diag+gender+vol_GTV+iforb2+iforb3+hem_LNL+
sero_thromb+LDH_UNL+ALAT_UNL+BASP_UNL+sero_bili+resection_perf+sero_WBC,
data=data1, x=TRUE, y=TRUE, method=c("efron"))

coxmodel.streg <- step(coxmodel4)

I would like to lock iAJCC2, iAJCC3, iPS2 and iPS3 into the model regardless, but I
cannot seem to get the step function to accept this.
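
For what it is worth, step() takes a scope argument whose lower component names terms that are never dropped. The following is only a sketch under assumptions: it uses the lung data set shipped with the survival package as a stand-in for the poster's data, and treats age and sex as the "locked" variables.

```r
library(survival)  # recommended package, ships with standard R installations

# Stand-in data: step() needs the same rows for every candidate model,
# so incomplete cases are dropped first
d <- na.omit(lung[, c("time", "status", "age", "sex",
                      "ph.ecog", "ph.karno", "wt.loss")])

full <- coxph(Surv(time, status) ~ age + sex + ph.ecog + ph.karno + wt.loss,
              data = d)

# 'lower' lists the terms that must stay in every model step() considers
sel <- step(full, scope = list(lower = ~ age + sex),
            direction = "backward", trace = 0)

kept <- attr(terms(sel), "term.labels")
print(kept)
```

Whether AIC-based selection is appropriate for a data set of this size is a separate question that the sketch does not address.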

 

Further, once I have the model I would like to validate it with the validate
command. I am presently using this:

fit <- cph(formula=Surv(OS,mors) ~
iAJCC2+iAJCC3+iPS2+iPS3+alder_diag+gender+vol_GTV+iforb2+iforb3+hem_LNL+
sero_thromb+LDH_UNL+ALAT_UNL+BASP_UNL+sero_bili+resection_perf+sero_WBC,
data=data1, x=TRUE, y=TRUE)

fit

validate(fit, method="boot", B=40, bw=TRUE, rule="p", type="residual",
sls=0.15, aics=0, pr=TRUE)

Because of my small data set (153 patients with 130 events), I have chosen to raise
the p limit from 5% to 15%, as suggested by Steyerberg.

I would appreciate any help with the lock term (also if it cannot be done).

As I mentioned, I am a bit of a rookie and not too experienced as a
programmer (I am an MD, after all).

However, I am quite impressed with R so far, since I have been trying to get
this far in Stata for a few weeks.

Sincerely,

Jon Kroll Bjerregaard, MD. Dep of Oncology Odense University Hospital



[R] help please ..simple question regarding output the p-value inside a function and lm

2011-02-25 Thread Umesh Rosyara
Dear R community members and R experts,

I am stuck at a point; I tried with my colleagues and we could not work it out.
Sorry, I need your help.

Here is my data (just created to show the example):

# generating a dataset just to show what my dataset looks like; here I have x
# variables x1 ... x1000 plus ind and y

ind <- c(1:100)
y <- rnorm(100, 10,2)
set.seed(201)
P <- vector()
dataf1 <- as.data.frame(matrix(rep(NA, 10), nrow=100))
dataf <- data.frame (dataf1, ind,y)
names(dataf) <- (c(paste("x",1:1000, sep=""),"ind", "y"))
for(i in 1:1000) {
dataf[,i] <- rnorm(100)
}

# my intention was to fit models in the following fashion:

y ~ x1 +x2, y ~ x3+x4, y ~ x5+ x6, ..., y ~ x999+x1000 (to the end of the
dataframe)

# please note that I want to avoid fitting y ~ x2 + x3 or y ~ x4 + x5 (meaning
# that I am selecting two x variables at a time, to the end)

# question: how can I do this and put it inside a user function, as I tried
# below?

# defining function for lm model

mylm <- function (mydata,nvar) {
y <- NULL
P1 <- vector (mode="numeric", length = nvar)
P2 <- vector (mode="numeric", length = nvar)
for(i in 1: nvar) {
print(P1[i] <- summary(lm(mydata$y ~   mydata[,i]) +
mydata[,i+1]$coefficients[2,4]))
print(P2[i] <- summary(lm(mydata$y ~   mydata[,i]) +
mydata[,i+1]$coefficients[2,5]))
print(plot(nvar, P1))
print(plot(nvar, P2))
}
}

# applying the function to mydata

mylm (dataf, 1000)

It does not work. The following is the error message:

Error in model.frame.default(formula = mydata$y ~ mydata[, i],
drop.unused.levels = TRUE) :
  invalid type (NULL) for variable 'mydata$y'

Please help!

Thanks,

Umesh R
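
A possible sketch of the pairwise fits described in this post (y ~ x1 + x2, y ~ x3 + x4, and so on), collecting the p-value of each slope. It is not the poster's code: the data are regenerated with 10 predictors instead of 1000 to keep it quick, and pair_pvalues is a hypothetical helper name.

```r
# Hypothetical small data set: predictors x1..x10 and a response y
set.seed(201)
n <- 100
nvar <- 10
dataf <- as.data.frame(matrix(rnorm(n * nvar), nrow = n))
names(dataf) <- paste0("x", seq_len(nvar))
dataf$y <- rnorm(n, 10, 2)

# Fit one model per non-overlapping pair of predictors and return the
# p-values of both slopes, one row per model
pair_pvalues <- function(mydata, nvar) {
  starts <- seq(1, nvar - 1, by = 2)            # 1, 3, 5, ...
  res <- lapply(starts, function(i) {
    vars <- paste0("x", c(i, i + 1))
    fit <- lm(reformulate(vars, response = "y"), data = mydata)
    p <- coef(summary(fit))[vars, "Pr(>|t|)"]   # rows of the two predictors
    data.frame(first = vars[1], second = vars[2],
               p_first = p[[1]], p_second = p[[2]])
  })
  do.call(rbind, res)
}

pvals <- pair_pvalues(dataf, nvar)
print(pvals)
```

Building the formula with reformulate() and passing data = mydata avoids referring to mydata$y inside the formula, and the parentheses around lm(...) are closed before the coefficient table is indexed.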




Re: [R] Error

2011-02-25 Thread mathijsdevaan

I simply don't understand why I get this error when using a larger dataset. 

Error in `[<-.data.frame`(`*tmp*`, i, , value = integer(0)) : 
  replacement has 0 items, need 37597770 
In addition: Warning message: 
In max(i) : no non-missing arguments to max; returning -Inf

Any ideas on what this error means? Thanks


-- 
View this message in context: 
http://r.789695.n4.nabble.com/Error-tp3324531p3324859.html
Sent from the R help mailing list archive at Nabble.com.



[R] ANOVA and Pseudoreplication in R

2011-02-25 Thread Ben Ward
Hi, As part of my dissertation, I'm going to be doing an ANOVA, 
comparing the dead zone diameters on plates of microbial growth with 
little paper disks loaded with antimicrobial; a clear zone appears 
where death occurs, the size depending on the strength and 
susceptibility. So it's basically 4 different treatments, and I'm 
comparing the diameters (in mm) of circles. I'm concerned, however, about 
pseudoreplication and how to deal with it in R (I thought of using the 
Error() term).


I have four levels of one factor(called Treatment): NE.Dettol, 
EV.Dettol, NE.Garlic, EV.Garlic.   (NE.Dettol is E.coli not evolved to 
dettol, exposed to dettol to get dead zones. And the same for 
NE.Garlic, but with garlic, not dettol. EV.Dettol is E.coli that has 
been evolved against dettol, and then tested afterwards against dettol 
to get the dead zones. Same applies for EV.Garlic but with garlic).  
You see from the four levels (or treatments) there are two chemicals 
involved. So my first concern is whether they should be analysed using 
two separate ANOVAs.


NE.Dettol and NE.Garlic are both the same organism - a lab stock E.coli, 
just exposed to two different chemicals.
EV.Dettol and EV.Garlic, are in principle, likely to be two different 
forms of the organism after the many experimental doses of their 
respective chemical.


For NE.Garlic and NE.Dettol I have 5 each of what I've called Lineages, 
basically separate bottles of them (10 in total).
Then I have 5 bottles (Lineages) of EV.Dettol, and 5 of EV.Garlic. 
This was done because there was the possibility that, whilst I'm 
expecting them all to respond in a similar manner, there are many 
evolutionary paths to the same result, and previous research and reading 
shows that occasionally one or two react differently to the rest through 
random chance.
The point I observed above (NE.Dettol and NE.Garlic are both the same 
organism...) is also applicable to the 5 bottles: The 5 bottles each of 
NE.Garlic and NE.Dettol are supposed to be all the same organism - from 
a stock one kept in store in the lab.
There is potential though for the 5 of EV.Garlic, to be different from 
one another, and potential for the 5 EV.Dettol to be different from one 
another.


The Lineage (bottle) is also a factor then, with 5 levels (1,2,3,4,5). 
Because they may be different.


To get the measurements of the diameter of the zones, I take out a small 
amount from a tube and spread it on a plate, then take three paper 
disks soaked in their respective chemical, either Dettol or Garlic, 
press them on and incubate them.
Then, when the zones have appeared after a day or two, I take 4 diameter 
measurements from each zone, across the zone at different angles, to 
account for the fact that a zone may have a weird shape or not be quite 
circular.


I'm concerned about pseudoreplication, such as the multiple readings 
from one disk, and the 5 lineages - which might be different from one 
another in each of the Two EV. treatments, but not with NE. treatments.


I read that I can remove pseudoreplication from the multiple readings 
from each disk by using the 4 readings on each disk to produce a mean 
for the disk, and analysing those means, exercising caution where there 
are extreme values. I think the 3 disks for each lineage themselves are 
not pseudoreplication, because they are genuinely 3 disks on a plate: 
the Disk Diffusion Test replicated 3 times. But the multiple readings 
from one disk are, I feel, pseudoreplication. I've also read about 
including Error() terms in a formula.


I'm unsure whether the two NE treatments coming from the same culture 
introduces pseudoreplication at the Treatment factor level, given that 
the two different antimicrobials used have two different effects.


I was hoping for a more expert opinion on whether I have identified 
pseudoreplication correctly, or whether there is indeed pseudoreplication in 
the 5 Lineages or anywhere else I haven't seen, and how best this is 
dealt with in R. At the minute my solution to the multiple readings from 
one disk is to simply make a new column holding the means and do the ANOVA 
on that, or even take the means before I load the dataset into R. 
I'm wondering if an Error() term would be correct.
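
The averaging step described above can be sketched as follows. Everything here is assumed for illustration: the data are simulated, the column names are made up, and the Error(Bottle) model is just one common way to declare bottles as the experimental unit, not a recommendation for this particular design.

```r
# Hypothetical layout: 4 treatments x 5 lineages x 3 disks x 4 readings
set.seed(42)
treatments <- c("NE.Dettol", "EV.Dettol", "NE.Garlic", "EV.Garlic")
zones <- expand.grid(Reading = 1:4, Disk = 1:3, Lineage = 1:5,
                     Treatment = treatments)
zones$Diameter <- rnorm(nrow(zones), mean = 12, sd = 1)

# Each treatment/lineage combination is one physical bottle
zones$Bottle <- interaction(zones$Treatment, zones$Lineage)

# Collapse the 4 readings per disk into one mean per disk
disk_means <- aggregate(Diameter ~ Treatment + Bottle + Disk,
                        data = zones, FUN = mean)

# Disks within the same bottle share an error stratum
fit <- aov(Diameter ~ Treatment + Error(Bottle), data = disk_means)
print(summary(fit))
```

With this layout the Treatment effect is tested against bottle-to-bottle variation rather than against the disk-level residual.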


Thanks,
Ben W.



Re: [R] accuracy of measurements

2011-02-25 Thread Denis Kazakiewicz
Thanks so much to everybody who found time to answer my question.
All your messages are of great help.
Good luck

 
 
On Fri, 25/02/2011 at 05:46 -0800, Dennis Murphy wrote:
 And in that vein, the recently released MethComp package by Bendix
 Carstensen may be of service.
 
 HTH,
 Dennis
 
 On Fri, Feb 25, 2011 at 5:39 AM, Marc Schwartz marc_schwa...@me.com
 wrote:
 On Feb 24, 2011, at 4:50 PM, Denis Kazakiewicz wrote:
 
  Dear R people,
  Could you please help with the following?
 
  I am trying to compare the accuracy of tumor size evaluation by different
  methods. The data look like this:
 
  id  true  method1  method2  ...
  1   2     2        2.5
  2   1.5   2        2
  3   2     2        2
 
  etc.
 
  Could you please give a hint on how to deal with that?
  It seems that {merror} does not suit me, because I am trying to compare
  the accuracy of measurements with their true known values, not just the
  overall agreement of methods.
  Moreover, the sample size is ridiculously small (33 patients), so ANOVA is not
  much help (or is it?).
  Any suggestions, hints and even guesses are highly appreciated. I am
  stuck a bit.
 
 
 
 Denis,
 
 I would suggest that you start here:
 
  http://www-users.york.ac.uk/~mb55/meas/meas.htm
 
 This covers various resources pertaining to the design and
 analysis of measurement studies, primarily based upon methods
 by Bland and Altman.
 
 HTH,
 
 Marc Schwartz
 
 


Re: [R] Calculate probabilty

2011-02-25 Thread rex.dwyer
Are you clear about the question you are asking?  Do you want to know whether 
there are 6 balls or at least 6 balls?  (It sounds like at least.)  Do you 
want to know whether there are at least 6 balls in the first box, or at least 6 
balls in exactly one box or at least 6 balls in at least one box?

This is the probability that there are exactly 6 balls in the first box:
> dbinom(6,142,1/491)
[1] 5.53366e-07

This is the probability that there are MORE THAN 6 balls in the first box
(NOT at least 6):
> 1-pbinom(6,142,1/491)
[1] 2.272026e-08
> sum(sapply(7:142, function(i) dbinom(i,142,1/491)))
[1] 2.272026e-08
> 1-sum(sapply(0:6, function(i) dbinom(i,142,1/491)))
[1] 2.272026e-08

This is the probability that there are at least 6 balls in the first box:
> 1-pbinom(5,142,1/491)
[1] 5.760862e-07

You can get all this from ?dbinom, but it is pretty confusing that the argument n 
and the italic n in the details are totally different things: italic n = 
argument size.  (Likewise, italic p = argument prob, not argument p.)

Questions about more than one box are a little harder since the boxes are not 
independent.
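
For the multi-box case, a simple bound can still be computed without modelling the dependence. The sketch below is an addition, not part of the original reply: it recomputes the single-box tail probability and then applies the union bound (Boole's inequality), which gives an upper limit on the probability that at least one of the 491 boxes holds 6 or more balls.

```r
n_balls <- 142
n_boxes <- 491

# P(at least 6 balls in one specific box), written with lower.tail = FALSE
# instead of 1 - pbinom(...)
p_first <- pbinom(5, n_balls, 1 / n_boxes, lower.tail = FALSE)

# The same tail as an explicit sum over 6, 7, ..., 142
p_first_sum <- sum(dbinom(6:n_balls, n_balls, 1 / n_boxes))

# Union bound: P(some box has >= 6 balls) <= n_boxes * P(a given box has >= 6).
# This is an upper bound, not the exact probability.
p_any_upper <- n_boxes * p_first

print(c(p_first = p_first, p_first_sum = p_first_sum, p_any_upper = p_any_upper))
```

The same product, n_boxes * p_first, is also the expected number of boxes holding 6 or more balls, which is why the bound is tight when that number is small.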

HTH,
Rex

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Fabrice Tourre
Sent: Thursday, February 24, 2011 3:51 PM
To: r-help@r-project.org
Subject: [R] Calculate probabilty

Hi List,

I have a question about calculating a probability using R.

There are 491 boxes and 142 balls. The balls are put into the boxes at
random. How do I calculate the probability that there are six or more in
one box?

I have tried:

dbinom(6,142,1/491)

1-pbinom(6,142,1/491)

But I think I am unclear about dbinom and pbinom.

Thank you very much in advance.



[R] Missing R.h

2011-02-25 Thread Noah Silverman
Hi,

I'm trying to install a module - gputools - and keep getting compile
time errors about missing R.h

Does anyone know where this file can be found?

Thanks!



Re: [R] Variable names AS variable names?

2011-02-25 Thread Jonathan P Daily
To access a variable by a character string name, try

for(code in codes)
{
dat <- get(code)
[stuff]
}

Other options include ?assign if you need to manipulate the original, or 
?with to use the subject of codes as an environment.
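
A minimal, self-contained sketch of the get/assign route, with small made-up matrices standing in for the real data. mget() is an addition here that fetches several objects by name at once.

```r
# Hypothetical matrices named after the codes
a1  <- matrix(1:4, nrow = 2)
b24 <- matrix(5:8, nrow = 2)
q99 <- matrix(9:12, nrow = 2)
codes <- c("a1", "b24", "q99")

# get() looks up the object whose name is stored in 'code'
for (code in codes) {
  dat <- get(code)
  cat(code, "has", nrow(dat), "rows and sum", sum(dat), "\n")
}

# assign() writes back under a name held in a string,
# e.g. appending a new row to the matrix named "a1"
assign("a1", rbind(get("a1"), c(9L, 9L)))

# mget() returns a named list of all the matrices in one call
mats <- mget(codes)
print(names(mats))
print(dim(mats[["a1"]]))
```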
--
Jonathan P. Daily
Technician - USGS Leetown Science Center
11649 Leetown Road
Kearneysville WV, 25430
(304) 724-4480
Is the room still a room when its empty? Does the room,
 the thing itself have purpose? Or do we, what's the word... imbue it.
 - Jubal Early, Firefly



Re: [R] Accessing sub diagonals / spdiag in R ?

2011-02-25 Thread Gabor Grothendieck
On Fri, Feb 25, 2011 at 11:26 AM, Mingo catojo...@gmail.com wrote:
 Hello, I'm attempting to access a specific number of sub diagonals in a
 MATRIX and have been accustomed to using spdiags in MATLAB or Octave. I've
 got a solution pieced together using for loops and it works though isn't
 vectorized and liable to run very slow
 for large matrices.

 As an example:

 A =
 1 2 3 4 5
 9 8 7 6 5
 4 5 6 7 8
 5 4 3 2 1
 8 7 6 0 1

 The subdiagonals are: 9,5,3,0   4,4,6   5,7  and 8,
 I know about lower.tri and can fetch the data in a resulting vector
 which ,in this case, would be:

 9,4,5,8,5,4,7,3,6,0

 though I would have to manipulate this some more to extract the other
 diagonals (imagine this being done for say a 1000 x 1000 matrix). I looked
 at CRAN and didn't see anything corresponding to
 spdiags. The closest package appeared to be the one relating to sparse
 matrices and band symmetry. Would you have any suggestions about 1) how to
 emulate spdiags or 2) working with the lower.tri returned-data and
 extracting the remaining diags efficiently. I can live with what I have but
 imagine that there is something more direct. Thanks

A[ col(A) == row(A) - i ] is the ith subdiagonal
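
Applied to the example matrix from the question, the indexing trick can be tried out as below. The subdiag helper name and the split() variant, which pulls all subdiagonals out of the lower.tri vector in one pass, are additions for illustration.

```r
A <- matrix(c(1, 2, 3, 4, 5,
              9, 8, 7, 6, 5,
              4, 5, 6, 7, 8,
              5, 4, 3, 2, 1,
              8, 7, 6, 0, 1), nrow = 5, byrow = TRUE)

# i-th subdiagonal: entries whose row index exceeds the column index by i
subdiag <- function(M, i) M[col(M) == row(M) - i]

first  <- subdiag(A, 1)
second <- subdiag(A, 2)

# All subdiagonals at once: group the lower triangle by (row - col)
lt <- lower.tri(A)
all_subs <- split(A[lt], (row(A) - col(A))[lt])

print(first)
print(second)
print(all_subs)
```

Both versions are vectorized, so no explicit loop over matrix entries is needed.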

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



Re: [R] Variable names AS variable names?

2011-02-25 Thread David Winsemius


On Feb 25, 2011, at 12:33 PM, Noah Silverman wrote:


My actual code is several things with adaptive filtering.  This will
require accessing data sporadically.  The loop was just a quick example
for the e-mail.

One application is to work with online (streaming) data.  If I get a new
data point in for code a1, I'll need to be able to reference the
matrix named a1.


So, how do these things you are calling "codes" get their names?

("code" is not an R datatype. a1 is a matrix, "a1" is a character
value and it would be returned by names(a1).)


a1 <- matrix(rnorm(100), nrow=10)
b24 <- matrix(rnorm(100), nrow=10)
q99 <- matrix(rnorm(100), nrow=10)

codes <- list(a1=a1, b24=b24, q99=q99)

str(codes[['a1']])
 ... should be a matrix

Assignment also works with [[ or with [.
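
A short sketch of how the named-list approach could serve the streaming case mentioned earlier in the thread. The incoming_code and new_row variables are hypothetical stand-ins for whatever the data feed delivers.

```r
set.seed(1)
codes <- list(a1  = matrix(rnorm(100), nrow = 10),
              b24 = matrix(rnorm(100), nrow = 10),
              q99 = matrix(rnorm(100), nrow = 10))

# Hypothetical streaming update: a new observation arrives for code "a1"
incoming_code <- "a1"
new_row <- rnorm(10)

# [[ with a character string both reads and assigns the matching element
codes[[incoming_code]] <- rbind(codes[[incoming_code]], new_row)

# Looping over the names gives access to both the code and its matrix
for (code in names(codes)) {
  cat(code, ":", nrow(codes[[code]]), "rows\n")
}
```

Keeping the matrices in one list means no lookup by variable name in the workspace is required.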

We really _do_ need  examples that represent the problems posed on the  
list. You have been posting a sufficient number of times to have  
understood this by now.


-- David





David Winsemius, MD
West Hartford, CT



Re: [R] Interpreting the example given by Prof Frank Harrell in {Design} validate.cph

2011-02-25 Thread Frank Harrell

P.S. I used the latest version of the rms package to run this.  The Design
package is no longer supported.

Frank

-
Frank Harrell
Department of Biostatistics, Vanderbilt University
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Interpreting-the-example-given-by-Prof-Frank-Harrell-in-Design-validate-cph-tp3316820p3325050.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Variable names AS variable names?

2011-02-25 Thread David Winsemius


On Feb 25, 2011, at 1:09 PM, David Winsemius wrote:



On Feb 25, 2011, at 12:33 PM, Noah Silverman wrote:


My actual code is several things with adaptive filtering.  This will
require accessing data sporadically.  The loop was just a quick example
for the e-mail.

One application is to work with online (streaming) data.  If I get a new
data point in for code a1, I'll need to be able to reference the
matrix named a1.


So, how do these things you are calling "codes" get their names?

("code" is not an R datatype. a1 is a matrix, "a1" is a character
value and it would be returned by names(a1).)


That's not correct. names(a1) would not return "a1", but names(codes)[1]
would if codes is defined as a list as below.


a1 <- matrix(rnorm(100), nrow=10)
b24 <- matrix(rnorm(100), nrow=10)
q99 <- matrix(rnorm(100), nrow=10)

codes <- list(a1=a1, b24=b24, q99=q99)

str(codes[['a1']])
... should be a matrix

Assignment also works with [[ or with [.

We really _do_ need  examples that represent the problems posed on  
the list. You have been posting a sufficient number of times to have  
understood this by now.


-- David





David Winsemius, MD
West Hartford, CT



Re: [R] BFGS versus L-BFGS-B

2011-02-25 Thread Brian Tsai
Hi John,

Thanks so much for the informative reply!  I'm currently trying to optimize
~10,000 parameters simultaneously. When I compare the memory usage of
L-BFGS-B and BFGS, with all default input parameters, L-BFGS-B uses only
about 1/7 of the memory. I'm a bit surprised that it isn't a lot less, but
BFGS is definitely converging a lot slower.

My other question is that L-BFGS-B is returning 'non-finite' errors with
respect to the gradient function I'm supplying. Again, all the
parameters I'm optimizing need to be non-negative (so I'm optimizing the log
of the parameters), but the gradient at some point divides by each
parameter, so when some of the parameters go to 0, the gradient becomes
infinite.  Do you (or anyone else) have any suggestions for how to prevent
this?  Is the only way to force the parameters to be larger than some number
close to 0 (e.g. 1e-10), or to modify the gradient function to set the entries
for small parameters to 0?
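
Two of the options in play can be sketched side by side on a toy problem. The objective below is made up (its minimum is at p = target and its gradient divides by p), so whether the same cancellation happens for the real gradient is an assumption: by the chain rule, the gradient with respect to theta = log(p) is p times the gradient with respect to p, which removes a 1/p factor when one is present.

```r
# Hypothetical objective with a 1/p term in its gradient
target <- c(0.5, 2, 4)
fn <- function(p) sum(p - target * log(p))
gr <- function(p) 1 - target / p

# Option 1: box constraint keeping every parameter away from 0
fit_box <- optim(rep(1, 3), fn, gr, method = "L-BFGS-B", lower = 1e-10)

# Option 2: unconstrained optimization over theta = log(p).
# Chain rule: d fn / d theta = p * (1 - target / p) = p - target,
# so no division by p remains in the gradient.
fn_log <- function(theta) fn(exp(theta))
gr_log <- function(theta) exp(theta) - target
fit_log <- optim(rep(0, 3), fn_log, gr_log, method = "BFGS")

print(fit_box$par)
print(exp(fit_log$par))
```

On this toy problem both routes should land near the same minimum; how they compare at the scale of ~10,000 parameters is not something the sketch can show.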

Thanks!

Brian.


On Fri, Feb 25, 2011 at 10:51 AM, Prof. John C Nash nas...@uottawa.ca wrote:

 There are considerable differences between the algorithms. And BFGS is an
 unfortunate nomenclature, since there are so many variants that are VERY
 different. It was called variable metric in my book from which the code was
 derived, and that code was from Roger Fletcher's Fortran VM code based on his
 1970 paper. L-BFGS-B is a later and more complicated algorithm with some
 pretty nice properties. The code is much larger.

 Re: "less memory" -- this will depend on the number of parameters, but to
 my knowledge there are no good benchmark studies of memory and performance.
 Perhaps someone wants to propose one for Google Summer of Code (see
 http://rwiki.sciviews.org/doku.php?id=developers:projects:gsoc2011 ).

 The optimx package can call Rvmmin, which has box constraints (also Rcgmin,
 which is intended for very low memory), as well as several other methods with
 box constraints, including L-BFGS-B. Worth a try if you are seeking a method
 for multiple production runs. Unfortunately, we seem to have some CRAN check
 errors on Solaris and some old releases -- platforms I do not have -- so it
 may be a few days or more until we sort out the issues, which seem to be
 related to alignment of the underlying packages for which optimx is a
 wrapper.

 Use of transformation can be very effective. But again, I don't think there
 are good studies on whether use of box constraints or transformations is
 better, and when. That is another project, which I have made some tentative
 beginnings to carry out. Collaborations welcome.

 Best,

 JN


 On 02/25/2011 06:00 AM, r-help-requ...@r-project.org wrote:
  Message: 86
  Date: Fri, 25 Feb 2011 00:11:59 -0500
  From: Brian Tsai btsa...@gmail.com
  To: r-help@r-project.org
  Subject: [R] BFGS versus L-BFGS-B
  Message-ID:
aanlktimszvkjbuhv-bbr1easpx9ootjxqcujgujr5...@mail.gmail.com
  Content-Type: text/plain
 
  Hi all,
 
  I'm trying to figure out the effective differences between BFGS and
 L-BFGS-B
  are, besides the obvious that L-BFGS-B should be using a lot less memory,
  and the user can provide box constraints.
 
  1) Why would you ever want to use BFGS, if L-BFGS-B does the same thing
 but
  use less memory?
 
  2) If i'm optimizing with respect to a variable x that must be
 non-negative,
  a common approach is to do a change of variables x = exp(y), and optimize
  unconstrained with respect to y.  Is optimization using box constraints
 on
  x, likely to produce as good a result as unconstrained optimization on y?
 
  - Brian.
 


Re: [R] Calculate probabilty

2011-02-25 Thread Fabrice Tourre
Hi Rex,

Thanks for your explanation. In fact, my question is: when I observe that
there are 6 or more balls in one box, what is the probability of this? The
balls are put into the boxes at random.
I think it is: 1-pbinom(6,142,1/491) = 2.272026e-08.

When the sample size is large, how should I do this? Using chisq.test?
Because the binomial test is not suitable for large sample sizes.
For example,
There are 6000 balls and 500 boxes, when I observed that there are 60
or more balls in one box, what is this probability?
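
A short sketch of the tail probabilities involved, assuming each ball lands in any box independently and with equal probability. Note that pbinom(q, ...) returns P(X <= q), so 1 - pbinom(6, ...) is the probability of 7 or more; for "6 or more" the cutoff is 5. The variable names are illustrative only, and the result refers to one pre-specified box, not to "at least one box somewhere".

```r
n <- 142
p <- 1 / 491

# P(X > 6), i.e. 7 or more balls in the given box
p_gt6 <- 1 - pbinom(6, n, p)

# P(X >= 6), i.e. 6 or more balls; lower.tail = FALSE avoids computing 1 - (value near 1)
p_ge6 <- pbinom(5, n, p, lower.tail = FALSE)

# The larger case from the post: 6000 balls, 500 boxes, 60 or more in one box.
# pbinom evaluates the exact binomial tail here as well.
p_ge60 <- pbinom(59, 6000, 1 / 500, lower.tail = FALSE)

c(p_gt6 = p_gt6, p_ge6 = p_ge6, p_ge60 = p_ge60)
```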

On Fri, Feb 25, 2011 at 6:40 PM,  rex.dw...@syngenta.com wrote:
 Rex



Re: [R] ANOVA and Pseudoreplication in R

2011-02-25 Thread Bert Gunter
I can hopefully save bandwidth here by suggesting that this belongs on
the R-sig-mixed-models list.

-- Bert

As an aside, shouldn't you be figuring this out yourself or seeking
local consulting expertise?

On Fri, Feb 25, 2011 at 9:08 AM, Ben Ward benjamin.w...@bathspa.org wrote:
 Hi, As part of my dissertation, I'm going to be doing an Anova, comparing
 the dead zone diameters on plates of microbial growth with little paper
 disks loaded with antimicrobial. A clear zone appears where death occurs,
 the size depending on the strength and susceptibility. So it's basically 4
 different treatments, and I'm comparing the diameters (in mm) of circles.
 I'm concerned, however, about pseudoreplication and how to deal with it in R
 (I thought of using the Error() term).

 I have four levels of one factor(called Treatment): NE.Dettol, EV.Dettol,
 NE.Garlic, EV.Garlic.   (NE.Dettol is E.coli not evolved to dettol,
 exposed to dettol to get dead zones. And the same for NE.Garlic, but with
 garlic, not dettol. EV.Dettol is E.coli that has been evolved against
 dettol, and then tested afterwards against dettol to get the dead zones.
 Same applies for EV.Garlic but with garlic).  You see from the four levels
 (or treatments) there are two chemicals involved. So my first concern is
 whether they should be analysed using two separate ANOVAs.

 NE.Dettol and NE.Garlic are both the same organism - a lab stock E.coli,
 just exposed to two different chemicals.
 EV.Dettol and EV.Garlic, are in principle, likely to be two different forms
 of the organism after the many experimental doses of their respective
 chemical.

 For NE.Garlic and NE.Dettol I have 5 of what I've called Lineages, basically
 separate bottles of them (10 in total).
 Then I have 5 Bottles (Lineages) of EV.Dettol, and 5 of EV.Garlic. This
 was done because there was the possibility that, whilst I'm expecting them
 all to respond in a similar manner, there are many evolutionary paths to the
 same result, and previous research and reading shows that occasionally one
 or two react differently to the rest through random chance.
 The point I observed above (NE.Dettol and NE.Garlic are both the same
 organism...) is also applicable to the 5 bottles: The 5 bottles each of
 NE.Garlic and NE.Dettol are supposed to be all the same organism - from a
 stock one kept in store in the lab.
 There is potential though for the 5 of EV.Garlic, to be different from one
 another, and potential for the 5 EV.Dettol to be different from one another.

 The Lineage (bottle) is also a factor then, with 5 levels (1,2,3,4,5).
 Because they may be different.

 To get the measurements of the diameter of the zones, I take out a small
 amount from a tube and spread it on a plate, then take three paper disks,
 soaked in their respective chemical, either Dettol or Garlic, press them
 on and incubate them.
 Then, when the zones have appeared after a day or 2, I take 4 diameter
 measurements from each zone, across the zone at different angles, to
 account for the fact that it may have a weird shape or not be quite
 circular.

 I'm concerned about pseudoreplication, such as the multiple readings from
 one disk, and the 5 lineages - which might be different from one another in
 each of the Two EV. treatments, but not with NE. treatments.

 I read that I can remove pseudoreplication from the multiple readings from
 each disk by using the 4 readings on each disk to produce a mean for the
 disk, and analysing those means, exercising caution where there are extreme
 values. I think the 3 disks for each lineage themselves are not
 pseudoreplication, because they are genuinely 3 disks on a plate: the Disk
 Diffusion Test replicated 3 times. But the multiple readings from one disk
 are, I feel, pseudoreplication. I've also read about including Error() terms
 in a formula.

 I'm unsure whether the two NE. Treatments coming from the same culture
 introduce pseudoreplication at Treatment Factor Level; I suspect not,
 because the two different antimicrobials used have two different effects.

 I was hoping for a more expert opinion on whether I have identified
 pseudoreplication correctly or if there is indeed pseudoreplication in the 5
 Lineages or anywhere else I haven't seen. And how best this is dealt with in
 R. At the minute my solution to the multiple readings from one disk is to
 simply make a new factor, with the means on and do Anova from that, or even
 take the means before I even load the dataset into R. I'm wondering if an
 Error() term would be correct.
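
 The two ideas in the paragraph above (averaging the readings per disk, then
 using an Error() term) can be sketched on simulated data. Everything here is
 an assumption for illustration: the column names, the simulated diameters,
 and in particular the choice of the bottle as the error stratum. Whether that
 stratum is the right one for this design is exactly the question raised, so
 this is not a recommended analysis.

```r
set.seed(1)
# Simulated layout: 4 treatments x 5 lineages x 3 disks x 4 readings per disk
d <- expand.grid(Reading = 1:4, Disk = 1:3, Lineage = 1:5,
                 Treatment = c("NE.Dettol", "EV.Dettol", "NE.Garlic", "EV.Garlic"))
d$Diameter <- rnorm(nrow(d), mean = 20, sd = 2)   # made-up measurements (mm)
d$Bottle   <- interaction(d$Treatment, d$Lineage) # 20 distinct bottles

# Step 1: collapse the 4 readings on each disk to a single mean
disk_means <- aggregate(Diameter ~ Treatment + Lineage + Bottle + Disk,
                        data = d, FUN = mean)

# Step 2: test Treatment against bottle-to-bottle variation
fit <- aov(Diameter ~ Treatment + Error(Bottle), data = disk_means)
summary(fit)
```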

 Thanks,
 Ben W.





-- 
Bert Gunter
Genentech Nonclinical Biostatistics


[R] R in different OS

2011-02-25 Thread Hui Du
Hi All,

I have two Rs, one has been installed in Windows system and 
another one has been installed under UNIX system. Is there any environmental 
variable or function to tell me which R I am using? The reason that I need to 
know it is under different system, the data path could be different. I want to 
do something like

if it is R under Windows

path = /ABC
else if it is R under UNIX,
path = /DEF

Any idea? Thanks.

Best Regards,

HXD



[R] means, SD's and tapply

2011-02-25 Thread Christopher R. Dolanc
I'm trying to use tapply to output means and SD or SE for my data but 
seem to be limited by how many times I can subset it.  Here's a snippet 
of my data

  stems353[1:10,]
     Time DataSource   Plot Elevation Aspect Slope     Type Species SizeClass Stems
1  Modern    Cameron 70F221      1730    ESE    20  Conifer    ABCO    Class1     3
2  Modern    Cameron 70F221      1730    ESE    20  Conifer    ABMA    Class1     0
3  Modern    Cameron 70F221      1730    ESE    20 Hardwood    ACMA    Class1     0
4  Modern    Cameron 70F221      1730    ESE    20 Hardwood    AECA    Class1     0
5  Modern    Cameron 70F221      1730    ESE    20 Hardwood    ARME    Class1     0
6  Modern    Cameron 70F221      1730    ESE    20  Conifer    CADE    Class1    15
7  Modern    Cameron 70F221      1730    ESE    20 Hardwood    CELE    Class1     0
8  Modern    Cameron 70F221      1730    ESE    20 Hardwood    CONU    Class1     0
9  Modern    Cameron 70F221      1730    ESE    20  Conifer    JUCA    Class1     0
10 Modern    Cameron 70F221      1730    ESE    20  Conifer    JUOC    Class1     0

I'd like to see means/SD of Stems stratified by Species, Time and 
SizeClass.  I can get R to give me this for means by species:

  tapply(stems353$Stems, stems353$Species, mean)
        ABCO         ABMA         ACMA         AECA         ARME         CADE         CELE
0.7305240793 0.8569405099 0.0003541076 0.0010623229 0.0017705382 0.4684844193 0.0063739377
        CONU         JUCA         JUOC         LIDE         PIAL         PICO         PIJE
0.0017705382 0.0003541076 0.0959631728 0.0138101983 0.3905807365 1.5651558074 0.2315864023
        PILA         PIMO        PIMO2         PIPO         PISA         POTR         PSME
0.1774079320 0.1880311615 0.0311614731 0.6735127479 0.0237252125 0.0506373938 0.2000708215
        QUCH         QUDO         QUDU         QUKE         QULO         QUWI        Salix
0.0474504249 0.1203966006 0.0000000000 0.2071529745 0.0003541076 0.0548866856 0.0003541076
        SEGI         TSME
0.0021246459 0.5017705382
 

but I really need to see each species by SizeClass and Time so that each 
value would be labeled something like ABCOSizeClass1TimeModern.  
Adding 2 variables to the function doesn't seem to work

  tapply(stems353$Stems, stems353$Species, stems353$SizeClass, 
stems353$Time, mean)
Error in match.fun(FUN) :
   'stems353$SizeClass' is not a function, character or symbol

I've already created proper subsets for each of these groups, e.g. one 
subset is called stems353ABCO1 and I can run analyses on this.  But, 
trying to extract means straight from those subsets doesn't seem to work

  mean(stems353ABCO1)
[1] NA
Warning message:
In mean.default(stems353ABCO1) :
   argument is not numeric or logical: returning NA
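
Both errors above have a common cause that a small sketch can show. tapply() takes its grouping factors as a single list in the second argument, so extra factors passed separately get matched to FUN; and mean() applied to a whole data frame returns NA, so the numeric column has to be selected. The toy data frame `stems` and its values are invented for illustration and only mimic the column names in the post.

```r
# Toy data: 2 species x 2 size classes x 2 times, 2 plots each (values invented)
stems <- expand.grid(Species   = c("ABCO", "ABMA"),
                     SizeClass = c("Class1", "Class2"),
                     Time      = c("Historic", "Modern"),
                     Plot      = 1:2)
stems$Stems <- c(3, 0, 1, 2, 0, 4, 5, 1,
                 1, 2, 3, 0, 2, 2, 1, 3)

# Several grouping factors go in ONE list; the result is a 3-d array
means <- tapply(stems$Stems,
                list(stems$Species, stems$SizeClass, stems$Time), mean)
means["ABCO", "Class1", "Historic"]

# One flat, labelled vector instead (labels like "ABCO.Class1.Historic")
flat <- tapply(stems$Stems,
               interaction(stems$Species, stems$SizeClass, stems$Time), mean)

# Mean and SD together, one row per combination
stats <- aggregate(Stems ~ Species + SizeClass + Time, data = stems,
                   FUN = function(x) c(mean = mean(x), sd = sd(x)))

# For a subset that is still a data frame, take the mean of the column
abco1 <- subset(stems, Species == "ABCO" & SizeClass == "Class1")
mean(abco1$Stems)
```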
 

Thanks,
Chris Dolanc

-- 
Christopher R. Dolanc
PhD Candidate
Ecology Graduate Group
University of California, Davis
Lab Phone: (530) 752-2644 (Barbour lab)




Re: [R] Group rows by common ID and plot?

2011-02-25 Thread DB1984

Thanks Mike - this doesn't quite do it, but I think that you've hit on the
right method.

I am just trying to use 'plot' initially - I don't care so much about the
arrangement in the file.

plot(df$y,group=df$f) outputs the Y column in the appropriate plot. What I
would like to do is have 10 Y columns, i.e. Y1, Y2, Y3...Y10, and plot just
the values in each row that is grouped by 'f'. Does this make sense?


I'm looking at 'group' functions that allow extracting of all the values of
all rows that match a unique 'f', and then trying to plot them individually,
but not working yet.

Any other suggestions for group functions that might allow the data to be
reshaped into appropriate lists?
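
One way to get "all the values of all rows that match a unique 'f'" is split(), sketched here under assumptions: the data frame `df`, its grouping column `f`, and the columns Y1 to Y10 are invented to match the description above, and plot_group() is a hypothetical helper, not part of any package.

```r
# Invented example data: grouping column f plus ten value columns Y1..Y10
df <- data.frame(f = rep(c("a", "b", "c"), each = 2),
                 matrix(seq_len(60), nrow = 6,
                        dimnames = list(NULL, paste0("Y", 1:10))))
ycols <- paste0("Y", 1:10)

# A named list with one data frame of Y values per level of f
groups <- split(df[, ycols], df$f)

# Hypothetical helper: one line per row of the group, Y1..Y10 along the x axis
plot_group <- function(g) {
  matplot(t(as.matrix(groups[[g]])), type = "l",
          xlab = "Y column", ylab = "value", main = g)
}

if (interactive()) {
  for (g in names(groups)) plot_group(g)
}
```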


Scott - I think that the main issue is the upfront grouping of all columns
within a row, rather than the faceting...


Thanks.
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Group-rows-by-common-ID-and-plot-tp3321955p3325121.html
Sent from the R help mailing list archive at Nabble.com.



[R] Help with card-sorting experiment

2011-02-25 Thread Steven Wolf
This is the first time that I've posted to this list, so if I'm doing
something wrong, please let me know.  Also, if there is a searchable forum
where I can find help, that would be good too.

 

I'm doing a card sorting experiment, and I'm having problems inputting my
data into R for later analysis.  This is what I get from each sorter:

Category Name    Card numbers

Group 1:         1,3,5,7,9

Group 2:         2,4,6,8,10

 

I would like to use the adjusted Rand index to compare each sorter.  As an
input the ARI needs a vector like this:

 

Group 1 Group 2 Group 1 Group 2 Group 1 Group 2 Group 1 Group 2
Group 1 Group 2

 

or even.   1 2 1 2 1 2 1 2 1 2

 

(Note:  I am using the function adjustedRandIndex from the library mclust)

 

I also need to make a similarity matrix so that I can do cluster analysis on
the reviewers (which I am able to do).

 

I would like to be able to load in several of these vectors at once so that
I can create a big matrix with all of the data in it, and I currently do
this using read.table()

 

PROBLEM 1)

When I attempt to do the Adjusted Rand Index calculation I do this:

reviewer1 <- read.table(filename, sep=",") (they are csv files)

.

 

Then I try to make a big matrix by doing this:

rset[1] <- reviewer1

rset[2] <- reviewer2 .

 

So when I try to do 

adjustedRandIndex(rset[1],rset[2]) 

I get an error message:

Error in FUN(X, Y, ...) : comparison of these types is not implemented

 

And If I do this:

 rset[1]

[[1]]

[1] set1 set2 set1 set2 set1 set2 set1 set2 set1 set2

Levels: set1 set2
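
The printed `[[1]]` suggests that rset is a list, in which case rset[1] is itself a one-element list and rset[[1]] is the factor inside it. A minimal sketch of that difference, together with one way to turn the "group: card numbers" layout into a label vector, follows. The helper sort_to_labels() and the second reviewer's sort are hypothetical, and the sketch assumes every card sits in exactly one pile, so it does not address Problem 2.

```r
# Hypothetical helper: list of piles (card numbers) -> one label per card
sort_to_labels <- function(groups, n_cards) {
  labels <- rep(NA_character_, n_cards)
  for (g in names(groups)) labels[groups[[g]]] <- g
  factor(labels)
}

reviewer1 <- sort_to_labels(list(set1 = c(1, 3, 5, 7, 9),
                                 set2 = c(2, 4, 6, 8, 10)), n_cards = 10)
reviewer2 <- sort_to_labels(list(set1 = 1:5, set2 = 6:10), n_cards = 10)

rset <- list(reviewer1, reviewer2)
class(rset[1])     # "list": single brackets keep the list wrapper
class(rset[[1]])   # "factor": double brackets extract the element

# adjustedRandIndex() is from mclust, as in the post; called only if installed
if (requireNamespace("mclust", quietly = TRUE)) {
  print(mclust::adjustedRandIndex(rset[[1]], rset[[2]]))
}
```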

 

PROBLEM 2)

Some of my reviewers have put single cards into two piles (which was
allowed).  However, this routine for the Adjusted Rand Index doesn't seem to
be able to handle that sort of category as an input.  Is that a problem for
the Adj Rand Index in general?  Is there a routine that can find the
Adjusted Rand Index for a different input?

 

Thanks,

-Steve Wolf

MSU--Lyman Briggs College




Re: [R] Interactive/Dynamic plots with R

2011-02-25 Thread Greg Snow
What types of interaction do you want?

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Abhishek Pratap
 Sent: Friday, February 25, 2011 12:37 AM
 To: r-help@r-project.org
 Subject: [R] Interactive/Dynamic plots with R
 
 Hi Guys
 
 In order to look at a dense plot I would like to have the capability
 to plot dynamic/interactive. Before I try rgobi which I heard can help
 me; I would like to take your opinion.
 
 Thanks!
 -Abhi
 


Re: [R] Error

2011-02-25 Thread rex.dwyer
Does it work for FUN=mean?  If yes, you need to print out the results of f 
before you return them to find the anomalous value.
BTW Error is not a very good subject line.  I don't see many posts from 
people reporting how well things are going :)


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of mathijsdevaan
Sent: Friday, February 25, 2011 9:31 AM
To: r-help@r-project.org
Subject: [R] Error


Hi, I am running the following script for a different (much larger data
frame):

DF = data.frame(read.table(textConnection("A  B  C  D  E
1 1  a  1999  1  0
2 1  b  1999  0  1
3 1  c  1999  0  1
4 1  d  1999  1  0
5 2  c  2001  1  0
6 2  d  2001  0  1
7 3  a  2004  0  1
8 3  b  2004  0  1
9 3  d  2004  0  1
10 4  b  2001  1  0
11 4  c  2001  1  0
12 4  d  2001  0  1"),header=TRUE,stringsAsFactors=FALSE))
DF<-DF[order(DF$B,DF$C),] #order by developer_id and year
f<- function(x)
{
unlist(lapply(x, FUN = function(z) cumsum(z) - z))
}
DF<-cbind(DF[,c(1:3)],ave(DF[, c(4:5)],DF$B, FUN = f))

I get the following error:

Error in `[<-.data.frame`(`*tmp*`, i, , value = integer(0)) :
  replacement has 0 items, need 37597770
In addition: Warning message:
In max(i) : no non-missing arguments to max; returning -Inf

The dimensions of the data frame are (5,108), so the last line of the
script becomes:

DF<-cbind(DF[,c(1:3)],ave(DF[, c(4:108)],DF$B, FUN = f))

Any idea how to solve this problem? Thanks!


--
View this message in context: 
http://r.789695.n4.nabble.com/Error-tp3324531p3324531.html
Sent from the R help mailing list archive at Nabble.com.



[R] lme in loop help

2011-02-25 Thread Ram H. Sharma
Dear R users

I am a new R user; excuse me for bothering you, but I have worked hard to
find a solution:

# data
ID <- c(1:100)
set.seed(21)
y <- rnorm(100, 10,2)
x1 <- rnorm(100, 10,2)
x2 <- rnorm(100, 10,2)
x3 <- rnorm(100, 10,2)
x4 <- rnorm(100, 10,2)
x5 <- rnorm(100, 10,2)
x6 <- rnorm(100, 10,2)
mydf <- data.frame(ID,y, x1,x2, x3, x4, x5, x6)
# just separate analysis
require(nlme)
mod1 <- lme(fixed= y ~ x1 + x2, random = ~ 1 | ID) # I want to put subject /
# ID as random; is it correct??
m1 <- anova(mod1)
m1 [1,4] # I am not getting exact value below .0001???, how to get one?
m1 [2,4]
m1 [3,4]
# putting in a loop
for(i in length(mydf)){
  mylme <- NULL
  print(m1[i+1,] <- lme(fixed= mydf$y ~ mydf$x[,i] + mydf$x[,i+1], random = ~
1 | ID))}

I could not get this to work! However, I have the following output in mind:
# The output in my mind: a data frame with the following
model   p-intercept  variable1  p-value1  variable2  p-value2
1       <.0001       x1         0.9452    x2         0.5455
2       <.0001       x3         0.3301    x4         0.9905
3       <.0001       x5         0.9971    x6         0.0487

I need a solution as I have tons of variables to work on. I am showing these
six just as an example.
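
A sketch of one way to build such a table, under stated assumptions. The data below are simulated with 20 subjects measured 5 times each (an assumption: with one row per ID, as in the post, a random intercept per ID cannot be separated from the residual), the predictors are paired as (x1, x2), (x3, x4), (x5, x6), and the names `pairs` and `res` are illustrative. The formulas are built with reformulate(), and the numeric p-values are taken from the "p-value" column of anova().

```r
library(nlme)

set.seed(21)
n_subj <- 20
n_rep  <- 5
subj_effect <- rnorm(n_subj, 0, 1)            # simulated subject-level shifts
mydf <- data.frame(ID = factor(rep(seq_len(n_subj), each = n_rep)))
mydf$y <- 10 + subj_effect[as.integer(mydf$ID)] + rnorm(nrow(mydf), 0, 2)
for (j in 1:6) mydf[[paste0("x", j)]] <- rnorm(nrow(mydf), 10, 2)

# One row per model: which two predictors go in together
pairs <- matrix(paste0("x", 1:6), ncol = 2, byrow = TRUE)

res <- do.call(rbind, lapply(seq_len(nrow(pairs)), function(k) {
  fml <- reformulate(pairs[k, ], response = "y")     # e.g. y ~ x1 + x2
  fit <- lme(fixed = fml, random = ~ 1 | ID, data = mydf)
  p   <- anova(fit)[, "p-value"]                     # intercept, var1, var2
  data.frame(model = k, p_intercept = p[1],
             variable1 = pairs[k, 1], p_value1 = p[2],
             variable2 = pairs[k, 2], p_value2 = p[3])
}))
res
```

The printed anova table shows "<.0001" for very small values, but the stored numbers extracted this way are the full-precision p-values.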

Thank you in advance;

ram sharma



Re: [R] R in different OS

2011-02-25 Thread Jorge Ivan Velez
Hi Hui,

Maybe sessionInfo() is what you are looking for. See ?sessionInfo as well
as ?version for more details. You can run the following on your R session
and see what comes up:

sessionInfo()
sessionInfo()$R.version$platform
version$platform

Then, you might use ifelse() to set up the right path.

HTH,
Jorge


On Fri, Feb 25, 2011 at 1:23 PM, Hui Du  wrote:

 Hi All,

I have two Rs, one has been installed in Windows system and
 another one has been installed under UNIX system. Is there any environmental
 variable or function to tell me which R I am using? The reason that I need
 to know it is under different system, the data path could be different. I
 want to do something like

 if it is R under Windows

path = /ABC
 else if it is R under UNIX,
path = /DEF

 Any idea? Thanks.

 Best Regards,

 HXD



Re: [R] R in different OS

2011-02-25 Thread Marc Schwartz
On Feb 25, 2011, at 12:23 PM, Hui Du wrote:

 Hi All,
 
I have two Rs, one has been installed in Windows system and 
 another one has been installed under UNIX system. Is there any environmental 
 variable or function to tell me which R I am using? The reason that I need to 
 know it is under different system, the data path could be different. I want 
 to do something like
 
 if it is R under Windows
 
path = /ABC
 else if it is R under UNIX,
path = /DEF
 
 Any idea? Thanks.
 
 Best Regards,
 
 HXD


See ?.Platform, more specifically:

On Unixen (e.g. Linux, OSX)

> .Platform$OS.type
[1] "unix"

and on Windows, will be "windows".

If needed, look at the additional functions listed in the See Also on the help 
page (eg. ?Sys.info, etc.).
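
Applied to the original question, this might look like the following sketch. The paths "/ABC" and "/DEF" are the placeholders from the question, and `data_path` is an illustrative variable name.

```r
# .Platform$OS.type is "windows" on Windows and "unix" on Unix-alikes
data_path <- if (.Platform$OS.type == "windows") "/ABC" else "/DEF"

# The same choice with switch(), failing loudly on anything unexpected
data_path <- switch(.Platform$OS.type,
                    windows = "/ABC",
                    unix    = "/DEF",
                    stop("unrecognised OS type: ", .Platform$OS.type))
data_path
```

A plain if/else is used here because the condition is a single value; ifelse() is meant for element-wise choices over vectors.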

HTH,

Marc Schwartz



Re: [R] R in different OS

2011-02-25 Thread Ista Zahn
Hi,

see ?R.version

Something like
if(version$os == "mingw32") {
   path = "/ABC"} else {
   path = "/DEF"
}

might do it, but I'm not sure exactly what possible values version$os
can take or what determines the value exactly.

Best,
Ista


On Fri, Feb 25, 2011 at 1:23 PM, Hui Du hui...@dataventures.com wrote:
 Hi All,

                I have two Rs, one has been installed in Windows system and 
 another one has been installed under UNIX system. Is there any environmental 
 variable or function to tell me which R I am using? The reason that I need to 
 know it is under different system, the data path could be different. I want 
 to do something like

 if it is R under Windows

                path = /ABC
 else if it is R under UNIX,
                path = /DEF

 Any idea? Thanks.

 Best Regards,

 HXD





-- 
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org



Re: [R] R in different OS

2011-02-25 Thread David Winsemius


On Feb 25, 2011, at 1:23 PM, Hui Du wrote:


Hi All,

   I have two Rs, one has been installed in Windows  
system and another one has been installed under UNIX system. Is  
there any environmental variable or function to tell me which R I am  
using? The reason that I need to know it is under different system,  
the data path could be different. I want to do something like


if it is R under Windows

   path = /ABC
else if it is R under UNIX,
   path = /DEF


?version

--
David Winsemius, MD
West Hartford, CT
