Re: [R] [R-SIG-Mac] morley object?

2013-11-22 Thread Steven McKinney
Hi Ann

Looks like a typo - I see "moreley.tab" that should be "morley.tab".
Works for me after correcting that.

> filepath <- system.file("data", "moreley.tab", package="datasets")
> filepath
[1] ""
> 
> filepath <- system.file("data", "morley.tab", package="datasets")
> filepath
[1] "C:/PROGRA~1/R/R-30~1.1/library/datasets/data/morley.tab"
> mm <- read.table(filepath)
> head(mm)
Expt Run Speed
001    1   1   850
0021   2   740
0031   3   900
0041   4  1070
0051   5   930
0061   6   850
> 


Steven McKinney

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre





> -Original Message-
> From: r-sig-mac-boun...@r-project.org [mailto:r-sig-mac-bounces@r-
> project.org] On Behalf Of Ann Summy
> Sent: November-22-13 4:56 PM
> To: r-sig-mac; r-help
> Subject: [R-SIG-Mac] morley object?
> 
> Brand new to R and not a sys admin or programmer type.  I am trying to work
> through the intro tutorial in the sample session here
> http://cran.r-project.org/doc/manuals/R-intro.html#A-sample-session.
> 
> I have an error when following the instructions below:
> 
> > The next section will look at data from the classical experiment of
> > Michelson to measure the speed of light. This dataset is available in the
> > morley object, but we will read it to illustrate the read.table function.
> > filepath <- system.file("data", "morley.tab" ,
> package="datasets")filepath
> >
> > Get the path to the data file.
> > file.show(filepath)
> >
> > Optional. Look at the file.
> > mm <- read.table(filepath)mm
> >
> > Read in the Michelson data as a data frame, and look at it. There are
> five
> > experiments (column Expt) and each has 20 runs (column Run) and sl is the
> > recorded speed of light, suitably coded.
> >
> Here is my result:
> 
> > > filepath <- system.file("data", "moreley.tab", package="datasets")
> 
> > filepath
> 
> [1] ""
> 
> > file.show(filepath)
> 
> > mm <- read.table(filepath)
> 
> Error in read.table(filepath) : no lines available in input
> 
> In addition: Warning message:
> 
> In file(file, "rt") :
> 
>   file("") only supports open = "w+" and open = "w+b": using the former
> 
> 
> Is it because I am missing this "morley object"?  How do I see a list of
> what objects I have installed?  I installed
> R-2.15.3.pkg<http://cran.r-project.org/bin/macosx/old/R-2.15.3.pkg>,
> because
> I am teaching myself on a 32-bit Macbook.
> 
>   [[alternative HTML version deleted]]
> 
> ___
> R-SIG-Mac mailing list
> r-sig-...@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-mac

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Change the text size of the title in a legend of a R plot.

2011-04-28 Thread Steven McKinney



> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
> Behalf Of Victor Gabillon
> Sent: April-28-11 8:22 PM
> To: r-help@r-project.org
> Subject: [R] Change the text size of the title in a legend of a R plot.
> 
> Hello,
> 
> Is it possible to change the text size of the title in a legend of a R plot?
> 
> I tried to directly change the title.cex argument but it seems not to work.
> 
> Trying :
> 
> Horizo <- c(1,2,6,10,20)
> legtext <- paste(Horizo,sep="")
> legend("topleft", legend=legtext,col=col,text.col=col,lwd=lwd,
> lty=lty,cex=1.1,ncol=3,title = "Horizons",title.col ="black",title.cex=1.4)

I haven't found any cex argument that works for just the legend title, but you
can get some modification of the title with the expression argument:

legend("topleft", legend=legtext,col=col,text.col=col,lwd=lwd, 
lty=lty,cex=1.1,ncol=3,
title = expression(bold("Horizons")),title.col="black")

Does that help?

Otherwise, you can of course figure out which functions do the legend plotting,
copy and modify those to get a title cex in place.

Steve McKinney



> 
> gives the following error (sorry in french):
> Erreur dans legend("topleft", legend = legtext, col = col, text.col =
> col,  :
>argument(s) inutilisé(s) (title.cex = 1.4)
> 
> saying title.cex argument as been ignored.
> 
> Thank you for helping.
> 
> Victor
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] is this an ANOVA ?

2011-04-12 Thread Steven McKinney

> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
> Behalf Of Ubuntu Diego
> Sent: April-12-11 5:10 PM
> To: r-help@r-project.org
> Subject: [R] is this an ANOVA ?
> 
> Hi all,
>   I have a very easy questions (I hope). I had measure a property of 
> plants, growing in three
> different substrates (A, B and C). The rest of the conditions remained 
> constant. There was very high
> variation on the results.
>   I want to do address, whether there is any difference in the response 
> (my measurement) from
> substrate to substrate?
> 
> x<-c('A','A','A','A','A','B','B','B','B','B','C','C','C','C','C') # Substrate 
> type
> y <- c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15) # Results of the measurement
> MD<-data.frame(x,y)
> 
>   I wrote a linear model for this:
> 
> summary(lm(y~x,data=MD))
> 
>   This is the output:
> 
> Call:
> lm(formula = y ~ x, data = MD)
> 
> Residuals:
>Min 1Q Median 3QMax
> -2.000e+00 -1.000e+00  5.551e-17  1.000e+00  2.000e+00
> 
> Coefficients:
> Estimate Std. Error t value Pr(>|t|)
> (Intercept)   3. 0.7071   4.243 0.001142 **
> xB5. 1.   5.000 0.000309 ***
> xC   10. 1.  10.000 3.58e-07 ***
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> 
> Residual standard error: 1.581 on 12 degrees of freedom
> Multiple R-squared: 0.8929,   Adjusted R-squared: 0.875
> F-statistic:50 on 2 and 12 DF,  p-value: 1.513e-06
> 
>   I conclude that there is an effect of substrate type (x).
>   NOW the questions :
>   1) Do the fact that the all p-values are significant means that 
> all the groups are
> different from each other ?

No, the small p-values indicate that the associated estimate appears to be 
significantly different from zero.

You can use the package "multcomp" to do multiple comparisons.

 > require("multcomp")
 > lma <- aov(y ~ x, data = MD)
 > lmamc <- glht(lma, linfct = mcp(x = "Tukey"))
 > ci.lma <- confint(lmamc)
 > ci.lma

  Simultaneous Confidence Intervals

 Multiple Comparisons of Means: Tukey Contrasts
 
 
 Fit: aov(formula = y ~ x, data = MD)

 Quantile = 2.667
 95% family-wise confidence level
 

 Linear Hypotheses:
Estimate lwr upr
 B - A == 0  5.   2.3330  7.6670
 C - A == 0 10.   7.3330 12.6670
 C - B == 0  5.   2.3330  7.6670

 > lmacld <- cld(lmamc)
 > plot(lmacld)
 > plot(ci.lma)
 



>   2) Is there a (easy) way to plot,  mean plus/minus 2*sd for 
> each substrate type ? (with
> asterisks denoting significant differences ?)

Not that I know of, though the multcomp tables and plots yield logically 
equivalent results and plots.
Writing a few lines of code to accomplish your graph is fairly straightforward.

HTH

Steven McKinney


> 
> 
>   THANKS !
> 
> version
> platform   x86_64-apple-darwin9.8.0
> arch   x86_64
> os darwin9.8.0
> system x86_64, darwin9.8.0
> status
> major  2
> minor  11.1
> year   2010
> month  05
> day31
> svn rev52157
> language   R
> version.string R version 2.11.1 (2010-05-31)
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Fast version of Fisher's Exact Test

2011-04-11 Thread Steven McKinney

Depends on how many other programs, and how large they are, and how much
RAM you have on your machine.  If I repeatedly run the example I used below, 
my R session shows 170MB of memory usage, not a huge amount relative to total
memory, and not a huge amount even for 32 bit R.  But if your system has 2 GB 
of RAM and 1.9 GB is consumed by other processes, then this example will cause 
swapping and speed will be reduced.

So figuring out a solution requires understanding what it is that is causing
the slowdown - not enough RAM, other programs competing for CPU cycles...

You can try switching to 64 bit R but unless your 32 bit R is loading some large
data objects, leaving little RAM, you won't see much of a difference.
If you start R, and do rm(list = ls()) to ensure no big data objects are using 
up
RAM, does the example below still take a long time?

You haven't mentioned what operating system you are using, how much RAM you have
or what sessionInfo() reports on your machine.  That information will help to
figure this out.



Steven McKinney


From: Jim Silverton [jim.silver...@gmail.com]
Sent: April 9, 2011 9:21 AM
To: Steven McKinney
Subject: Re: [R] Fast version of Fisher's Exact Test

I R 32 bit installed but my machine is 64 bit. Do I need to upgrade the R to 64 
bit for it to run faster?



On Fri, Apr 8, 2011 at 6:44 PM, Steven McKinney 
mailto:smckin...@bccrc.ca>> wrote:
Do you mean a test something such as this?

> fisher.test(matrix(c(502,498,490, 510), nrow = 2))

   Fisher's Exact Test for Count Data

data:  matrix(c(502, 498, 490, 510), nrow = 2)
p-value = 0.6228
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
 0.8770113 1.2550998
sample estimates:
odds ratio
 1.049119

This runs quickly on my machine.

> system.time(fisher.test(matrix(c(502,498,490, 510), nrow = 2)))
  user  system elapsed
 0.008   0.001   0.010
> sessionInfo()
R version 2.12.2 (2011-02-25)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_CA.UTF-8/en_CA.UTF-8/C/C/en_CA.UTF-8/en_CA.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

loaded via a namespace (and not attached):
[1] tools_2.12.2
>

Can you provide an example that is running slowly for you?

Steven McKinney


From: r-help-boun...@r-project.org<mailto:r-help-boun...@r-project.org> 
[r-help-boun...@r-project.org<mailto:r-help-boun...@r-project.org>] On Behalf 
Of Jim Silverton [jim.silver...@gmail.com<mailto:jim.silver...@gmail.com>]
Sent: April 8, 2011 9:43 AM
To: r-help@r-project.org<mailto:r-help@r-project.org>
Subject: Re: [R] Fast version of Fisher's Exact Test

Is anyone aware of a fast way of doing fisher's exact test for a series of 2
x 2 tables in R? The fisher.test is really slow if n1=1000 and n2 = 1000.

--
Thanks,
Jim.

   [[alternative HTML version deleted]]

__
R-help@r-project.org<mailto:R-help@r-project.org> mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Thanks,
Jim.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Fast version of Fisher's Exact Test

2011-04-08 Thread Steven McKinney
Do you mean a test something such as this?

> fisher.test(matrix(c(502,498,490, 510), nrow = 2))

Fisher's Exact Test for Count Data

data:  matrix(c(502, 498, 490, 510), nrow = 2) 
p-value = 0.6228
alternative hypothesis: true odds ratio is not equal to 1 
95 percent confidence interval:
 0.8770113 1.2550998 
sample estimates:
odds ratio 
  1.049119 

This runs quickly on my machine.

> system.time(fisher.test(matrix(c(502,498,490, 510), nrow = 2)))
   user  system elapsed 
  0.008   0.001   0.010 
> sessionInfo()
R version 2.12.2 (2011-02-25)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_CA.UTF-8/en_CA.UTF-8/C/C/en_CA.UTF-8/en_CA.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base 

loaded via a namespace (and not attached):
[1] tools_2.12.2
> 

Can you provide an example that is running slowly for you?

Steven McKinney


From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of 
Jim Silverton [jim.silver...@gmail.com]
Sent: April 8, 2011 9:43 AM
To: r-help@r-project.org
Subject: Re: [R] Fast version of Fisher's Exact Test

Is anyone aware of a fast way of doing fisher's exact test for a series of 2
x 2 tables in R? The fisher.test is really slow if n1=1000 and n2 = 1000.

--
Thanks,
Jim.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Where is the tcltk package?

2011-04-07 Thread Steven McKinney
D'oh!

Sorry for that noise.

You also need bwidget and Tktable installed on your linux system,
so you may need to reinstall R after getting bwidget and other
tcl/tk bits installed.  Are you using a tool such as "yum" to install
linux software, or installing yourself from source or binaries?



Steven McKinney


From: Rolf Turner [r.tur...@auckland.ac.nz]
Sent: April 7, 2011 5:38 PM
To: Steven McKinney
Cc: r-help@r-project.org
Subject: Re: [R] Where is the tcltk package?

On 08/04/11 12:31, Steven McKinney wrote:
> Searching for tcltk on the BioConductor site yields
>
> CRAN - Package tcltk2 - http://cran.r-project.org/web/packages/tcltk2/
>  CRAN - Package tcltk2 tcltk2: Tcl/Tk Additions A series of additional 
> Tcl commands and Tk widgets with style and various functions (under Windows: 
> DDE exchange, access to the registry
>
> If you install tcltk2, is tcltk then available?
>
I ***can't*** install tcltk2, as I said in my original post.  It
requires that tcltk be available.

 cheers,

 Rolf Turner

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Where is the tcltk package?

2011-04-07 Thread Steven McKinney
Searching for tcltk on the BioConductor site yields

CRAN - Package tcltk2 - http://cran.r-project.org/web/packages/tcltk2/
CRAN - Package tcltk2 tcltk2: Tcl/Tk Additions A series of additional Tcl 
commands and Tk widgets with style and various functions (under Windows: DDE 
exchange, access to the registry

If you install tcltk2, is tcltk then available?  

Steven McKinney


From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of 
Rolf Turner [r.tur...@auckland.ac.nz]
Sent: April 7, 2011 5:15 PM
To: r-help@r-project.org
Subject: [R] Where is the tcltk package?

Perhaps I'm being even thicker than usual, but I can't find
the tcltk package on CRAN.  There is a tcltk2 package, which says
that it is a collection of supplements to tcltk, but I cannot
see a just-plain tcltk anywhere.

If I try to install tcltk2 (from the Linux command line, or using
install.packages() in R) it complains that it needs tcltk.

If I try to install tcltk using install.packages() it tells me that
this package is not available.

What is going on?

 cheers,

 Rolf Turner

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANCOVA for linear regressions without intercept

2011-04-04 Thread Steven McKinney
Hi Yusuke,

Does the following get what you are after?

### Make some test data.
> set.seed(123)
> edf <- data.frame(sex = c(rep("Male", 10), rep("Female", 10), rep("Unknown", 
> 10)),
+   head_length = c(1.2 * c(170:179 + rnorm(10)), 0.8 * 
c(150:159 + rnorm(10)), c(160:169 + rnorm(10)))/10,
+   body_length = c(c(170:179 + rnorm(10)), c(150:159 + 
rnorm(10)), c(160:169 + rnorm(10)))
+   )
> edf$sex <- factor(as.character(edf$sex))
> plot(edf$head_length, edf$body_length, pch = as.numeric(edf$sex), col = 
> as.numeric(edf$sex), xlim = c(0, 25), ylim = c(0, 190))
> lmf <- lm(body_length ~ head_length * sex, data = edf)


### The full model - do keep an eye on those intercepts and try to ensure they 
are not far from 0.
> summary(lmf)

Call:
lm(formula = body_length ~ head_length * sex, data = edf)

Residuals:
 Min   1Q   Median   3Q  Max 
-2.73783 -0.68133  0.02147  0.50858  2.38931 

Coefficients:
   Estimate Std. Error t value Pr(>|t|)
(Intercept)  -3.578 25.425  -0.141   0.8893
head_length  12.772  2.054   6.2182e-06 ***
sexMale  15.122 37.464   0.404   0.6901
sexUnknown   40.308 33.137   1.216   0.2357
head_length:sexMale  -4.977  2.438  -2.042   0.0523 .  
head_length:sexUnknown   -4.971  2.428  -2.047   0.0517 .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Residual standard error: 1.384 on 24 degrees of freedom
Multiple R-squared: 0.9802, Adjusted R-squared: 0.9761 
F-statistic: 237.7 on 5 and 24 DF,  p-value: < 2.2e-16 

### Now suppress intercepts.  head_length:sex should give interactions (slopes) 
only.
> lmrf <- lm(body_length ~ -1 + head_length : sex, data = edf)
> summary(lmrf)

Call:
lm(formula = body_length ~ -1 + head_length:sex, data = edf)

Residuals:
 Min   1Q   Median   3Q  Max 
-3.02782 -0.61861 -0.01079  0.68785  2.57544 

Coefficients:
   Estimate Std. Error t value Pr(>|t|)
head_length:sexFemale  12.482530.03549   351.8   <2e-16 ***
head_length:sexMale 8.345000.02097   398.0   <2e-16 ***
head_length:sexUnknown 10.038440.02677   375.0   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Residual standard error: 1.389 on 27 degrees of freedom
Multiple R-squared: 0., Adjusted R-squared: 0. 
F-statistic: 1.409e+05 on 3 and 27 DF,  p-value: < 2.2e-16 

### Check the numeric coding of the factor
> with(edf, table(sex, as.numeric(sex)))
 
sex1  2  3
  Female  10  0  0
  Male 0 10  0
  Unknown  0  0 10

> abline(a = 0, b = coef(lmrf)[1], col = 1) ## Females = Black
> abline(a = 0, b = coef(lmrf)[2], col = 2) ## Males = Red
> abline(a = 0, b = coef(lmrf)[3], col = 3) ## Unknown = Green

### If no diff between males and females, then males and females can be 
combined into one group.
> edf$MvF <- as.character(edf$sex)
> edf$MvF[edf$MvF != "Unknown"] <- "MorF"
> edf$MvF <- factor(edf$MvF)
> with(edf, table(MvF, sex))
 sex
MvF   Female Male Unknown
  MorF10   10   0
  Unknown  00  10
> lmr1f  <- lm(body_length ~ -1 + head_length : MvF, data = edf)
> summary(lmr1f)

Call:
lm(formula = body_length ~ -1 + head_length:MvF, data = edf)

Residuals:
Min  1Q  Median  3Q Max 
-23.976 -21.656   0.077  35.899  39.839 

Coefficients:
   Estimate Std. Error t value Pr(>|t|)
head_length:MvFMorF  9.4156 0.3429   27.46   <2e-16 ***
head_length:MvFUnknown  10.0384 0.5085   19.74   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Residual standard error: 26.39 on 28 degrees of freedom
Multiple R-squared: 0.9761, Adjusted R-squared: 0.9744 
F-statistic: 571.9 on 2 and 28 DF,  p-value: < 2.2e-16 

### Test the hypothesis that male and female heights are equivalent
> anova(lmr1f, lmrf)
Analysis of Variance Table

Model 1: body_length ~ -1 + head_length:MvF
Model 2: body_length ~ -1 + head_length:sex
  Res.Df RSS Df Sum of Sq FPr(>F)
1 28 19496.1 
2 2752.1  1 19444 10077 < 2.2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

### Plot the reduced model regression lines
> abline(a = 0, b = coef(lmr1f)[1], col = "blue", lty = 2)
> abline(a = 0, b = coef(lmr1f)[2], col = "orange", lty = 2, lwd = 4)
> 

The other two tests can be set up and run similarly.  Don't
forget to adjust fo

Re: [R] Linear Model with curve fitting parameter?

2011-04-04 Thread Steven McKinney


> -Original Message-
> From: stephen sefick [mailto:ssef...@gmail.com]
> Sent: April-04-11 2:49 PM
> To: Steven McKinney
> Subject: Re: [R] Linear Model with curve fitting parameter?
>
> Steven:
>
> I am really sorry for my confusion.  I hope this now makes sense.
>
> b0 == y intercept == y-intercept == (intercept) fit by lm
>
> a <- 1:10
> b <- 1:10
>
> summary(lm(a~b))
> #to show what I was calling b0
>
> So...
>
> 
> manning
>
> Q = K*A*(R^b2)*(S^b3)
>
> log(Q) = log(K)+log(A)+(b2*log(R))+(b3*log(S))

Okay, using this notation, this appears to be the original
model you queried about.  So for this model, as I showed
before,

Let Z = log(Q) - log(A)

E(Z) = b0 + b2*log(R) + b3*log(S)
 = log(K) + b2*log(R) + b3*log(S)

Fitting the model  lm(Z ~ log(R) + log(S))
will yield parameter estimates b_hat_0, b_hat_2, b_hat_3
where
b_hat_0 (the fitted model intercept) is an estimate of b0 (which is log(K)),
b_hat_2 is an estimate of b2,
b_hat_3 is an estimate of b3.

So in answer to your previous question, b0 is an
estimate of log(K), not ( log(Qintercept)+log(K) )
so an estimate for K is exp(b_hat_0)


>
> 
> dingman
> Q = K*(A^b1)*(R^b2)*(S^b3*log(S))
>
> log(Q) = log(K)+(b1*log(A))+(b2*log(R))+(b3*(log(S))^2)

The dingman model notation is ambiguous.  Is the last
term  S^(b3*log(S))  or  (S^b3)*log(S) ?

Previous email showed

   > dingman
   > log(Q)=log(b0)+log(K)+a*log(A)+r*log(R)+s*(log(S))^2

which implies (if I ignore the log(b0) term)
  Q = K*(A^a)*(R^r)*(exp(log(S)*log(S))^s)
= K*(A^a)*(R^r)*(S^(log(S)*s))

This is linearizable as

log(Q) = log(K) + a*log(A) + r*log(R) + s*(log(S))^2
   = b0 + b1*log(A) + b2*log(R) + b3*(log(S)^2)

Fitting lm(log(Q) ~ log(A) + log(R) + I(log(S)^2) ... )
will yield estimates b_hat_0, b_hat_1, b_hat_2 and b_hat_3
where b_hat_0 is an estimate of b0 = log(K) so an estimate of K is exp(b_hat_0),
b_hat_1 is an estimate of b1 = a,
b_hat_2 is an estimate of b2 = r,
b_hat_3 is an estimate of b3 = s



>
> 
>
> Bjerklie
>
> Q = K*(A^b1)*(R^b2)*(S^b3)
>
> log(Q) = log(K)+(b1*log(A))+(b2*log(R))*(b3*log(S))

Fitting lm(log(Q) ~ log(A) + log(R) + log(S) ... )
will yield estimates b_hat_0, b_hat_1, b_hat_2 and b_hat_3
where b_hat_0 is an estimate of b0 = log(K) so an estimate of K is exp(b_hat_0),
b_hat_1 is an estimate of b1 = a,
b_hat_2 is an estimate of b2 = r,
b_hat_3 is an estimate of b3 = s


Best

Steve McKinney

>
> 
>
>
>
>
>
> On Mon, Apr 4, 2011 at 2:58 PM, Steven McKinney  wrote:
> >
> >> -Original Message-
> >> From: stephen sefick [mailto:ssef...@gmail.com]
> >> Sent: April-03-11 5:35 PM
> >> To: Steven McKinney
> >> Cc: R help
> >> Subject: Re: [R] Linear Model with curve fitting parameter?
> >>
> >> Steven:
> >>
> >> You are exactly right sorry I was confused.
> >>
> >>
> >> ###
> >> so log(y-intercept)+log(K) is a constant called b0 (is this right?)
> >
> > Doesn't look right to me based on the information you've provided.
> > I don't see anything labeled "y" in your previous emails, so I'm
> > not clear on what y is and how it relates to the original model
> > you described
> >
> >   > >> I have a model Q=K*A*(R^r)*(S^s)
> >   > >>
> >   > >> A, R, and S are data I have and K is a curve fitting parameter.
> >
> > If the model is
> >
> >   Q=K*A*(R^r)*(S^s)
> >
> > then
> >
> >   log(Q) = log(K) + log(A) + r*log(R) + s*log(S)
> >
> > Rearranging yields
> >
> >   log(Q) - log(A) = log(K) + r*log(R) + s*log(S)
> >
> > Let  Z = log(Q) - log(A) = log(Q/A)
> >
> > so
> >
> >   Z = log(K) + r*log(R) + s*log(S)
> >
> > and a linear model fit of
> >
> >   Z ~ log(R) + log(S)
> >
> > will yield parameter estimates for the linear equation
> >
> >   E(Z) = B0 + B1*log(R) + B2*log(S)
> >
> > (E(Z) = expected value of Z)
> >
> > so B0 estimate is an estimate of log(K)
> >   B1 estimate is an estimate of r
> >   B2 estimate is an estimate of s
> >
> > and these are the only parameters you described in the original model.
> >
> >
> >>
> >> lm(log(Q)~log(A)+log(R)+log(S)-1)
> >>
> >> is fitting the model
> >>
> >> log(Q)=a*log(A)+r*log(R)+s*log(S) (n

Re: [R] Linear Model with curve fitting parameter?

2011-04-04 Thread Steven McKinney

> -Original Message-
> From: stephen sefick [mailto:ssef...@gmail.com]
> Sent: April-03-11 5:35 PM
> To: Steven McKinney
> Cc: R help
> Subject: Re: [R] Linear Model with curve fitting parameter?
> 
> Steven:
> 
> You are exactly right sorry I was confused.
> 
> 
> ###
> so log(y-intercept)+log(K) is a constant called b0 (is this right?)

Doesn't look right to me based on the information you've provided.
I don't see anything labeled "y" in your previous emails, so I'm
not clear on what y is and how it relates to the original model 
you described

   > >> I have a model Q=K*A*(R^r)*(S^s)
   > >>
   > >> A, R, and S are data I have and K is a curve fitting parameter. 

If the model is 
  
   Q=K*A*(R^r)*(S^s)

then

   log(Q) = log(K) + log(A) + r*log(R) + s*log(S)

Rearranging yields

   log(Q) - log(A) = log(K) + r*log(R) + s*log(S)

Let  Z = log(Q) - log(A) = log(Q/A)

so

   Z = log(K) + r*log(R) + s*log(S)

and a linear model fit of
   
   Z ~ log(R) + log(S)

will yield parameter estimates for the linear equation

   E(Z) = B0 + B1*log(R) + B2*log(S)

(E(Z) = expected value of Z)

so B0 estimate is an estimate of log(K)
   B1 estimate is an estimate of r
   B2 estimate is an estimate of s

and these are the only parameters you described in the original model.


> 
> lm(log(Q)~log(A)+log(R)+log(S)-1)
> 
> is fitting the model
> 
> log(Q)=a*log(A)+r*log(R)+s*log(S) (no beta 0)
> 
> and
> 
> lm(log(Q)~log(A)+log(R)+log(S))
> 
> 
> is fitting the model
> 
> log(Q)=b0+a*log(A)+r*log(R)+s*log(S)

K has disappeared from these equations so these model fits do
not correspond to the model originally described.  Now a b0
appears, and is used in models below.  I think changing notation
is also adding confusion.  What are "y" and "intercept" you
discuss above, in relation to your original notation?

> 
> ##
> 
> These are the models I am trying to fit and if I have reasoned
> correctly above then I should be able to fit the below models
> similarly.

You will be able to fit models appropriately once you have a
clearly defined system of notation that allows you to map between
the proposed data model, the parameters in that model, and the
corresponding regression equations.  

Once you have consistent notation, you will be able to see
if you can express your model as a linear regression, or
if not, what kind of non-linear regression you will need to
do to get estimates for the parameters in your model.

Best

Steve McKinney

> 
> manning
> log(Q)=log(b0)+log(K)+log(A)+r*log(R)+s*log(S)
> 
> dingman
> log(Q)=log(b0)+log(K)+a*log(A)+r*log(R)+s*(log(S))^2
> 
> bjerklie
> log(Q)=log(b0)+log(K)+a*log(A)+r*log(R)+s*log(S)
> 
> ###
> 
> Thank you for all of your help!
> 
> Stephen
> 
> On Fri, Apr 1, 2011 at 2:44 PM, Steven McKinney  wrote:
> >
> >> -Original Message-
> >> From: stephen sefick [mailto:ssef...@gmail.com]
> >> Sent: April-01-11 5:44 AM
> >> To: Steven McKinney
> >> Cc: R help
> >> Subject: Re: [R] Linear Model with curve fitting parameter?
> >>
> >> Setting Z=Q-A would be the incorrect dimensions.  I could Z=Q/A.
> >
> > I suspect this is confusion about what Q is.  I was presuming that
> > the Q in this following formula was log(Q) with Q from the original data.
> >
> >> >> I have taken the log of the data that I have and this is the model
> >> >> formula without the K part
> >> >>
> >> >> lm(Q~offset(A)+R+S, data=x)
> >
> > If the model is
> >
> >   Q=K*A*(R^r)*(S^s)
> >
> > then
> >
> >   log(Q) = log(K) + log(A) + r*log(R) + s*log(S)
> >
> > Rearranging yields
> >
> >   log(Q) - log(A) = log(K) + r*log(R) + s*log(S)
> >
> > so what I labeled 'Z' below is
> >
> >   Z = log(Q) - log(A) = log(Q/A)
> >
> > so
> >
> >   Z = log(K) + r*log(R) + s*log(S)
> >
> > and a linear model fit of
> >
> >   Z ~ log(R) + log(S)
> >
> > will yield parameter estimates for the linear equation
> >
> >   E(Z) = B0 + B1*log(R) + B2*log(S)
> >
> > (E(Z) = expected value of Z)
> >
> > so B0 estimate is an estimate of log(K)
> >   B1 estimate is an estimate of r
> >   B2 estimate is an estimate of s
> >
> > More details and careful notation will eventually lead
> > to a reasonable description and analysis strategy.
> >
> >
> &g

Re: [R] Linear Model with curve fitting parameter?

2011-04-01 Thread Steven McKinney

> -Original Message-
> From: stephen sefick [mailto:ssef...@gmail.com]
> Sent: April-01-11 5:44 AM
> To: Steven McKinney
> Cc: R help
> Subject: Re: [R] Linear Model with curve fitting parameter?
> 
> Setting Z=Q-A would be the incorrect dimensions.  I could Z=Q/A.  

I suspect this is confusion about what Q is.  I was presuming that
the Q in this following formula was log(Q) with Q from the original data.

> >> I have taken the log of the data that I have and this is the model
> >> formula without the K part
> >>
> >> lm(Q~offset(A)+R+S, data=x)

If the model is 
  
   Q=K*A*(R^r)*(S^s)

then

   log(Q) = log(K) + log(A) + r*log(R) + s*log(S)

Rearranging yields

   log(Q) - log(A) = log(K) + r*log(R) + s*log(S)

so what I labeled 'Z' below is

   Z = log(Q) - log(A) = log(Q/A)

so

   Z = log(K) + r*log(R) + s*log(S)

and a linear model fit of
   
   Z ~ log(R) + log(S)

will yield parameter estimates for the linear equation

   E(Z) = B0 + B1*log(R) + B2*log(S)

(E(Z) = expected value of Z)

so B0 estimate is an estimate of log(K)
   B1 estimate is an estimate of r
   B2 estimate is an estimate of s

More details and careful notation will eventually lead
to a reasonable description and analysis strategy.


Best

Steve McKinney



> Is fitting a nls model the same as fitting an ols?  These data are
> hydraulic data from ~47 sites.  To access predictive ability I am
> removing one site fitting a new model and then accessing the fit with
> a myriad of model assessment criteria.  I should get the same answer
> with ols vs nls?  Thank you for all of your help.
> 
> Stephen
> 
> On Thu, Mar 31, 2011 at 8:34 PM, Steven McKinney  wrote:
> >
> >> -Original Message-
> >> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] 
> >> On Behalf Of stephen
> sefick
> >> Sent: March-31-11 3:38 PM
> >> To: R help
> >> Subject: [R] Linear Model with curve fitting parameter?
> >>
> >> I have a model Q=K*A*(R^r)*(S^s)
> >>
> >> A, R, and S are data I have and K is a curve fitting parameter.  I
> >> have linearized as
> >>
> >> log(Q)=log(K)+log(A)+r*log(R)+s*log(S)
> >>
> >> I have taken the log of the data that I have and this is the model
> >> formula without the K part
> >>
> >> lm(Q~offset(A)+R+S, data=x)
> >>
> >> What is the formula that I should use?
> >
> > Let Z = Q - A for your logged data.
> >
> > Fitting lm(Z ~ R + S, data = x) should yield
> > intercept parameter estimate = estimate for log(K)
> > R coefficient parameter estimate = estimate for r
> > S coefficient parameter estimate = estimate for s
> >
> >
> >
> > Steven McKinney
> >
> > Statistician
> > Molecular Oncology and Breast Cancer Program
> > British Columbia Cancer Research Centre
> >
> >
> >
> >>
> >> Thanks for all of your help.  I can provide a subset of data if necessary.
> >>
> >>
> >>
> >> --
> >> Stephen Sefick
> >> 
> >> | Auburn University                                         |
> >> | Biological Sciences                                      |
> >> | 331 Funchess Hall                                       |
> >> | Auburn, Alabama                                         |
> >> | 36849                                                           |
> >> |___|
> >> | sas0...@auburn.edu                                  |
> >> | http://www.auburn.edu/~sas0025                 |
> >> |___|
> >>
> >> Let's not spend our time and resources thinking about things that are
> >> so little or so large that all they really do for us is puff us up and
> >> make us feel like gods.  We are mammals, and have not exhausted the
> >> annoying little problems of being mammals.
> >>
> >>                                 -K. Mullis
> >>
> >> "A big computer, a complex algorithm and a long time does not equal 
> >> science."
> >>
> >>                               -Robert Gentleman
> >> __
> >> R-help@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide 
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >
> 
> 
> 
> --
> St

Re: [R] Linear Model with curve fitting parameter?

2011-03-31 Thread Steven McKinney

> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
> Behalf Of stephen sefick
> Sent: March-31-11 3:38 PM
> To: R help
> Subject: [R] Linear Model with curve fitting parameter?
> 
> I have a model Q=K*A*(R^r)*(S^s)
> 
> A, R, and S are data I have and K is a curve fitting parameter.  I
> have linearized as
> 
> log(Q)=log(K)+log(A)+r*log(R)+s*log(S)
> 
> I have taken the log of the data that I have and this is the model
> formula without the K part
> 
> lm(Q~offset(A)+R+S, data=x)
> 
> What is the formula that I should use?

Let Z = Q - A for your logged data.

Fitting lm(Z ~ R + S, data = x) should yield
intercept parameter estimate = estimate for log(K)
R coefficient parameter estimate = estimate for r
S coefficient parameter estimate = estimate for s



Steven McKinney

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre



> 
> Thanks for all of your help.  I can provide a subset of data if necessary.
> 
> 
> 
> --
> Stephen Sefick
> 
> | Auburn University                                         |
> | Biological Sciences                                      |
> | 331 Funchess Hall                                       |
> | Auburn, Alabama                                         |
> | 36849                                                           |
> |___|
> | sas0...@auburn.edu                                  |
> | http://www.auburn.edu/~sas0025                 |
> |___|
> 
> Let's not spend our time and resources thinking about things that are
> so little or so large that all they really do for us is puff us up and
> make us feel like gods.  We are mammals, and have not exhausted the
> annoying little problems of being mammals.
> 
>                                 -K. Mullis
> 
> "A big computer, a complex algorithm and a long time does not equal science."
> 
>                               -Robert Gentleman
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] xlsx problem

2011-03-29 Thread Steven McKinney
Hi Cristoph,

Glad to hear it worked - replying to r-help in case others face this issue.


Cheers

Steven McKinney


From: Martin Knapp [mkna...@aucklanduni.ac.nz]
Sent: March 28, 2011 7:28 PM
To: Steven McKinney
Subject: Re: [R] xlsx problem

Thanks that worked.
still don't know what was wrong though but I can live with that.

Christoph

On 29 March 2011 14:16, Steven McKinney  wrote:
> Dear Christoph
>
> Are there formulas in any cells of sheet 6?
>
> One thing you might try is to open up the spreadsheet in Excel and create a 
> new sheet,
> call it Sheet 7.
>
> In Sheet 6, select all the cells with data (or just click the
> upper left icon at the border of the spreadsheet that selects the whole
> sheet, -a will also do this), then go to sheet 7 and
> do a Paste Special - Values
>
> If you just paste the values, then save the file and read Sheet 7, does that 
> work?
>
>
> While you're in the spreadsheet, see if you can spot any odd looking entries 
> on sheet 6.
> Excel lets people put just about anything, just about anywhere.
>
> Steven McKinney
> 
> From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf 
> Of Martin Knapp [mkna...@aucklanduni.ac.nz]
> Sent: March 28, 2011 1:43 PM
> To: r-help
> Subject: [R] xlsx problem
>
> Dear list,
> I'm running windows xp with R 2.12.0. I'm trying to load a excel
> spreadsheet into R using the xlsx package. I posted my code below with
> the error I get.
>
>> res <- read.xlsx("Copy of test_excel_input_data.xlsx", 6)
> Error in .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", cl,  :
>  java.lang.IllegalStateException: Cannot get a numeric value from a text cell
>
> There are 6 sheets in that file. 5 of them are read into a dataframe
> without problem. For the 6th one(the one I actually need) I get this
> error message.
> There is obviously something in the format of the sheet wrong. Has
> anyone an idea whats going wrong and what I have to change in the
> sheet?
>
> Thanks in advance,
>
> Christoph
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Points on Map

2011-03-28 Thread Steven McKinney
Here's a start...

> require("maps")
> foo <- map("state", "new york")
> lines(x = range(foo$x, na.rm = TRUE), y = range(foo$y, na.rm = TRUE))

You can figure out from this how to specify the coordinates that you
want for dividing up the map, put them in a file etc.

Steven McKinney


From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of 
Jaimin Dave [davejaim...@gmail.com]
Sent: March 28, 2011 11:12 AM
To: r-help@r-project.org
Subject: [R] Points on Map

Hi,
I am new to R and I want to plot points on the Map of New York .  I also
want to divide map of New York into small grids(not fixed) .I want that
these point should be plotted from a file.How can I do it?Any help would be
greatly appreciated.


Thanks
Jaimin

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] rep for multiple categories

2011-03-28 Thread Steven McKinney
Hi Kathi,

Are you looking for something like this?

> Alist <- letters[1:4]
> Blist <- 1:5
> Clist <- LETTERS[24:26]
> Alist
[1] "a" "b" "c" "d"
> Blist
[1] 1 2 3 4 5
> Clist
[1] "X" "Y" "Z"

> expand.grid(Alist, Blist, Clist)
   Var1 Var2 Var3
1 a1X
2 b1X
3 c1X
4 d1X
5 a2X
6 b2X
7 c2X
8 d2X
9 a3X
10b3X
11c3X
12d3X
13a4X
14b4X
15c4X
16d4X
17a5X
18b5X
19c5X
20d5X
21a1Y
22b1Y
23c1Y
24d1Y
25a2Y
26b2Y
27c2Y
28d2Y
29a3Y
30b3Y
31c3Y
32d3Y
33a4Y
34b4Y
35c4Y
36d4Y
37a5Y
38b5Y
39c5Y
40d5Y
41a1Z
42b1Z
43c1Z
44d1Z
45a2Z
46b2Z
47c2Z
48d2Z
49a3Z
50b3Z
51c3Z
52d3Z
53a4Z
54b4Z
55c    4Z
56d4Z
57a5Z
58b5Z
59c5Z
60d5Z


Steven McKinney

From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of 
BORGMANN,Kathi [kborgm...@audubon.org]
Sent: March 28, 2011 11:43 AM
To: r-help@r-project.org
Subject: [R] rep for multiple categories

Hi,
I am R beginner and am trying to figure out how to generate a complete list of 
species for every point, visit, and year. The code below is close but does not 
give me a list of species for every point, visit, and year in my data set.

spplist<-unique(sumPtCt$Species)
spplength<-length(spplist)
Pointlist<-unique(sumPtCt$Point)
Pointlength<-length(Pointlist)
Visitlist<-unique(sumPtCt$Visit)
Visitlength<-length(Visitlist)
Yearlist<-unique(sumPtCt$Year)
Yearlength<-length(Yearlist)
s<-rep(spplist, each=Pointlength, Visitlength, Yearlength)
p<-rep(Pointlist, spplength)
v<-rep(Visitlist, spplength)
y<-rep(Yearlist, spplength)
template<-data.frame(Species=s,Point=p, Visit=v, Year=y)

###merge template and data and replace NAs with 0
FinalPtCt<-merge(template, sumPtCt, all=T)
FinalPtCt$Number[is.na(FinalPtCt$Number)]<-0


Essentially I have data that look like this
 SPP  Point  Visit  Number
BUFF   1  1   5
WEGR   1  1   10
CLGR   1  1   15
WEGU   2  15
RUDU   2  1   15
HOGR   2  15


But I want to generate this
Spp  Point  Visit   Number
BUFF  1   1   5
WEGR  1   1   10
CLGR  1   1   15
WEGU  1   1   0
RUDU  1   1   0
HOGR  1   1   0
WEGU  2   1   5
RUDU  2   1   15
HOGR  2   1   5
BUFF  2   1   0
WEGR  2   1   0
CLGR  2   1   0

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] xlsx problem

2011-03-28 Thread Steven McKinney
Dear Christoph

Are there formulas in any cells of sheet 6?

One thing you might try is to open up the spreadsheet in Excel and create a new 
sheet,
call it Sheet 7.

In Sheet 6, select all the cells with data (or just click the
upper left icon at the border of the spreadsheet that selects the whole
sheet, -a will also do this), then go to sheet 7 and
do a Paste Special - Values

If you just paste the values, then save the file and read Sheet 7, does that 
work?


While you're in the spreadsheet, see if you can spot any odd looking entries on 
sheet 6.
Excel lets people put just about anything, just about anywhere.

Steven McKinney

From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of 
Martin Knapp [mkna...@aucklanduni.ac.nz]
Sent: March 28, 2011 1:43 PM
To: r-help
Subject: [R] xlsx problem

Dear list,
I'm running windows xp with R 2.12.0. I'm trying to load a excel
spreadsheet into R using the xlsx package. I posted my code below with
the error I get.

> res <- read.xlsx("Copy of test_excel_input_data.xlsx", 6)
Error in .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", cl,  :
  java.lang.IllegalStateException: Cannot get a numeric value from a text cell

There are 6 sheets in that file. 5 of them are read into a dataframe
without problem. For the 6th one(the one I actually need) I get this
error message.
There is obviously something in the format of the sheet wrong. Has
anyone an idea whats going wrong and what I have to change in the
sheet?

Thanks in advance,

Christoph

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Dont show zero values in line graph

2011-01-06 Thread Steven McKinney
How about this?

> x<-c(1:5,NA,NA,8:10)
> y<-1:10
> plot(0,0,xlim=c(0,10), ylim=c(0,10),type="n",main="Dont show the bloody 0 
> values!!")
> lines(x~y, col="blue", lwd=2, subset = !is.na(x))

NAs let you do lots of useful manipulations in R.


Steven McKinney


From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of 
LCOG1 [jr...@lcog.org]
Sent: January 6, 2011 7:10 PM
To: r-help@r-project.org
Subject: [R] Dont show zero values in line graph

Hey everyone,
   Im getting better at plotting my data but cant for the life of me figure
out how to show a line graph with missing data that doesnt continue the line
down to zero then back up to the remaining values.

Consider the following
x<-c(1:5,0,0,8:10)
y<-1:10

plot(0,0,xlim=c(0,10), ylim=c(0,10),type="n",main="Dont show the bloody 0
values!!")
lines(x~y, col="blue", lwd=2,)

My data is missing the 6th and 7th values and they come in as NA's so i
change them to 0s but then the plot has these ugly lines that dive toward
the x axis then back up.  I would do bar plots but i need to show multiple
sets of data on the same and side by side bars doesnt do it for me.

So i need a line graph that starts and stops where 0s or missing values
exist.  Thoughts?

JR
--
View this message in context: 
http://r.789695.n4.nabble.com/Dont-show-zero-values-in-line-graph-tp3178566p3178566.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Schedule R to run automatically

2010-12-13 Thread Steven McKinney

> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
> Behalf Of joeponzio
> Sent: December-13-10 4:13 PM
> To: r-help@r-project.org
> Subject: [R] Schedule R to run automatically
> 
> 
> Current I have raw data from several projects that is written to a drive on a
> daily basis.  I would like to run a certain R syntax against these data sets
> (varying somewhat from data set to data set) every day at, say, 7am (with no
> human interaction).  Is this possible to set up in windows or unix?

Yes, in both.

In Windows, you can use the "Task Scheduler" (Control Panel - Administrative 
Tools
on Windows 7).  Set up the R script you want to run daily, and invoke it
with the R CMD BATCH command.

In Unix, you can use the cron scheduler to do the same.


Steven McKinney



> 
> Thanks for any input.
> 
> 
> 
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Schedule-R-to-run-automatically-
> tp3086272p3086272.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Tukey Test, lme, error: less than two groups

2010-12-03 Thread Steven McKinney
Just to close this thread, Lilith provided the data which was in a .csv text 
file
and had multiple lines of blank data at the end

species;code;treatment;pretreatment;provenance;greenhouse;individual;leaf;Date;DataPAM
Ae;c-ae-1-1-3;C;C;1;1;3;1;25.05.10 14:00; 0.665
.
.
.
Ae;w-ae-6-3-4;C;W;6;3;4;3;23.06.10 12:30; 0.622
;
;
;
;
;
;
;
;
;
;
;

Removing the blank (NA) rows of data and keeping all variables in the dataframe 
resolves the issue.

(Free floating variables yield this error
> summary(glht(PAM.lme, linfct = mcp(Provenancef = "Tukey")))
Error in `[.data.frame`(mf, nhypo[checknm]) : undefined columns selected)


Steven McKinney

From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of 
Steven McKinney [smckin...@bccrc.ca]
Sent: December 2, 2010 2:03 PM
To: 'Lilith'; r-help@r-project.org
Subject: Re: [R] Tukey Test, lme, error: less than two groups

Comments in-line below

> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
> Behalf Of Lilith
> Sent: December-02-10 9:39 AM
> To: r-help@r-project.org
> Subject: [R] Tukey Test, lme, error: less than two groups
>
>
> Dear R-group,
>
> I am trying desperately to get this Tukey test working. Its my first time
> here so I hope my question is not too stupid, but I couldn't find anything
> helpful in the help or in the forum.
>
> I am analysing a dataset of treated grasses. I want to find out, if the
> grasses from different Provenances react differently.
> In the aov test I found a significance for the combination Treatment and
> provenance:
>
> summary(PAMaov<-aov(PAMval~Treatmentf*Pretreatmentf*Provenancef+Error(Datef/Code)))
>
> Treatmentf:Provenancef   p-value:  0.008023 **
>
> In the Linear fixed effects model lme, I can see that there is a
> significance for two provenances (HU and ES)
>
> summary(PAM.lme<-lme(PAMval~Treatmentf*Provenancef*Pretreatmentf, random=
> ~1|Datef/Code,na.action=na.omit))
>
>   Value  Std.Error  DF   t-value
> p-value
> (Intercept)   0.6890317 0.06117401 994 11.263473
> 0.
> TreatmentfF  -0.2897619 0.05484590 467 -5.283201
> 0.
> ProvenancefDE 0.0105873 0.05484590 467  0.193037
> 0.8470
>
> TreatmentfF:ProvenancefES 0.1647302 0.08226884 467  2.002340
> 0.0458
> TreatmentfF:ProvenancefHU 0.1569524 0.07756381 467  2.023526
> 0.0436
>
> No the big mystery is the Tukey test. I just can't find the mistake, it
> keeps telling me, that there are " less than two groups"
>
> summary(glht(PAM.lme, linfct = mcp(Provenancef = "Tukey")))
>
> Fehler in contrMat(table(mf[[nm]]), type = types[pm]) :
>   less than two groups
>
> I guess its important to know that I made factors out of some of the data.
> Here is the code:
>
>
> PAMdata$provenance[PAMdata$provenance == "5"] = "ES"
> PAMdata$provenance[PAMdata$provenance == "6"] = "HU"
> # etc.
>
> Treatmentf <- factor(PAMdata$treatment, levels=c("C","F"))
> Datef <- factor(PAMdata$Date, levels=c( "25.05.10 14:00","26.05.10
> 19:00","27.05.2010 7:30","27.05.10 14:00","01.06.10 14:00","02.06.10
> 19:00","23.06.10 12:30"),ordered=TRUE)
>
>
> Pretreatmentf <- as.factor(PAMdata$pretreatment)
> Provenancef <- as.factor(PAMdata$provenance)
> Greenhousef <- as.factor(PAMdata$greenhouse)
> Individualf <- as.factor(PAMdata$individual)
>
> PAMval <- (PAMdata$DataPAM)
> Code<-(PAMdata$code)

I suspect the problem is the creation of all these individual variables.

Try instead

PAMdata$Treatmentf <- factor(PAMdata$treatment, levels=c("C","F"))
PAMdata$Datef <- factor(PAMdata$Date, levels=c( "25.05.10 14:00","26.05.10
 19:00","27.05.2010 7:30","27.05.10 14:00","01.06.10 14:00","02.06.10
 19:00","23.06.10 12:30"),ordered=TRUE)
...
PAMdata$PAMval <- (PAMdata$DataPAM)
PAMdata$Code<-(PAMdata$code)
etc.

so that all of your required variables are variables in the dataframe PAMdata.

When you pass off fitted model objects to additional functions, the
additional functions often require access to the dataframe used in the
initial modeling.  Then call lme with

summary(PAM.lme<-lme(PAMval~Treatmentf*Provenancef*Pretreatmentf, random=
 ~1|Datef/Code, data = PAMdata, na.action=na.omit))

then try

summary(glht(PAM

Re: [R] Tukey Test, lme, error: less than two groups

2010-12-02 Thread Steven McKinney
Comments in-line below

> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
> Behalf Of Lilith
> Sent: December-02-10 9:39 AM
> To: r-help@r-project.org
> Subject: [R] Tukey Test, lme, error: less than two groups
> 
> 
> Dear R-group,
> 
> I am trying desperately to get this Tukey test working. Its my first time
> here so I hope my question is not too stupid, but I couldn't find anything
> helpful in the help or in the forum.
> 
> I am analysing a dataset of treated grasses. I want to find out, if the
> grasses from different Provenances react differently.
> In the aov test I found a significance for the combination Treatment and
> provenance:
> 
> summary(PAMaov<-aov(PAMval~Treatmentf*Pretreatmentf*Provenancef+Error(Datef/Code)))
> 
> Treatmentf:Provenancef   p-value:  0.008023 **
> 
> In the Linear fixed effects model lme, I can see that there is a
> significance for two provenances (HU and ES)
> 
> summary(PAM.lme<-lme(PAMval~Treatmentf*Provenancef*Pretreatmentf, random=
> ~1|Datef/Code,na.action=na.omit))
> 
>   Value  Std.Error  DF   t-value
> p-value
> (Intercept)   0.6890317 0.06117401 994 11.263473
> 0.
> TreatmentfF  -0.2897619 0.05484590 467 -5.283201
> 0.
> ProvenancefDE 0.0105873 0.05484590 467  0.193037
> 0.8470
> 
> TreatmentfF:ProvenancefES 0.1647302 0.08226884 467  2.002340
> 0.0458
> TreatmentfF:ProvenancefHU 0.1569524 0.07756381 467  2.023526
> 0.0436
> 
> No the big mystery is the Tukey test. I just can't find the mistake, it
> keeps telling me, that there are " less than two groups"
> 
> summary(glht(PAM.lme, linfct = mcp(Provenancef = "Tukey")))
> 
> Fehler in contrMat(table(mf[[nm]]), type = types[pm]) :
>   less than two groups
> 
> I guess its important to know that I made factors out of some of the data.
> Here is the code:
> 
> 
> PAMdata$provenance[PAMdata$provenance == "5"] = "ES"
> PAMdata$provenance[PAMdata$provenance == "6"] = "HU"
> # etc.
> 
> Treatmentf <- factor(PAMdata$treatment, levels=c("C","F"))
> Datef <- factor(PAMdata$Date, levels=c( "25.05.10 14:00","26.05.10
> 19:00","27.05.2010 7:30","27.05.10 14:00","01.06.10 14:00","02.06.10
> 19:00","23.06.10 12:30"),ordered=TRUE)
> 
> 
> Pretreatmentf <- as.factor(PAMdata$pretreatment)
> Provenancef <- as.factor(PAMdata$provenance)
> Greenhousef <- as.factor(PAMdata$greenhouse)
> Individualf <- as.factor(PAMdata$individual)
> 
> PAMval <- (PAMdata$DataPAM)
> Code<-(PAMdata$code)

I suspect the problem is the creation of all these individual variables.

Try instead

PAMdata$Treatmentf <- factor(PAMdata$treatment, levels=c("C","F"))
PAMdata$Datef <- factor(PAMdata$Date, levels=c( "25.05.10 14:00","26.05.10
 19:00","27.05.2010 7:30","27.05.10 14:00","01.06.10 14:00","02.06.10
 19:00","23.06.10 12:30"),ordered=TRUE)
...
PAMdata$PAMval <- (PAMdata$DataPAM)
PAMdata$Code<-(PAMdata$code)
etc.

so that all of your required variables are variables in the dataframe PAMdata.

When you pass off fitted model objects to additional functions, the
additional functions often require access to the dataframe used in the
initial modeling.  Then call lme with

summary(PAM.lme<-lme(PAMval~Treatmentf*Provenancef*Pretreatmentf, random=
 ~1|Datef/Code, data = PAMdata, na.action=na.omit))

then try

summary(glht(PAM.lme, linfct = mcp(Provenancef = "Tukey")))

again.
 
(If you still get an error, run the

traceback()

command and provide that information.)


I'm also wondering why no term for Pretreatmentf shows in your model output.
After setting up the factor variables in the PAMdata dataframe, what does the
command

with(PAMdata, table(Pretreatmentf, Provenancef, Treatmentf))

show?  Is Pretreatmentf even needed in the model?

The output of the command

sessionInfo() 

is also useful to help people figure out such issues.

Also, if you can share the data, or a mock-up of it, others will
be able to run code examples, and not just guess.

HTH

Steve McKinney

> 
> Thank you for any hint! That Tukey test seems so easy, I just can't find the
> mistake
> Thank you very much fpr your help and greetings from Tanzania,
> Lilith
> 
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Tukey-Test-lme-error-less-than-two-groups-
> tp3069789p3069789.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commente

Re: [R] wilcox.test; data type conversion?

2010-10-28 Thread Steven McKinney
You can set up the data as

> grade <- ordered(c("MVG", "VG", "VG", "G", "MVG", "G", "VG", "G", "VG"), 
> levels = c("G", "VG", "MVG"))
> grade
[1] MVG VG  VG  G   MVG G   VG  G   VG 
Levels: G < VG < MVG
> sex <- factor(c( "male", "male", "female", "male", "female", "male", 
> "female", "male", "male"), levels = c("male", "female"))
> sex
[1] male   male   female male   female male   female male   male  
Levels: male female
> gradesbysex <- data.frame(grade, sex)
> 
> gradesbysex
  gradesex
1   MVG   male
2VG   male
3VG female
4 G   male
5   MVG female
6 G   male
7VG female
8 G   male
9VG   male

Now for the Wilcoxon-Mann_Whitney test

> wilcox.test(grade ~ sex, data = gradesbysex)
Error in wilcox.test.default(x = c(3L, 2L, 1L, 1L, 1L, 2L), y = c(2L,  : 
  'x' must be numeric

I'm not sure if anyone has written a version that will work on ordered factor 
variables,
but you can coerce the ordered factor to its underlying integer representation 
with e.g.

> wilcox.test(as.integer(grade) ~ sex, data = gradesbysex)

Wilcoxon rank sum test with continuity correction

data:  as.integer(grade) by sex 
W = 4.5, p-value = 0.2695
alternative hypothesis: true location shift is not equal to 0 

Warning message:
In wilcox.test.default(x = c(3L, 2L, 1L, 1L, 1L, 2L), y = c(2L,  :
  cannot compute exact p-value with ties

You can break the ties by jittering the data.  Each jitter will of course
produce different tie breakers.  A few repeats of the test, or a loop and
some summaries of the outcomes, will give you an idea of the
"average" result.

> wilcox.test(jitter(as.integer(grade)) ~ sex, data = gradesbysex)

Wilcoxon rank sum test

data:  jitter(as.integer(grade)) by sex 
W = 4, p-value = 0.2619
alternative hypothesis: true location shift is not equal to 0 

> wilcox.test(jitter(as.integer(grade)) ~ sex, data = gradesbysex)

Wilcoxon rank sum test

data:  jitter(as.integer(grade)) by sex 
W = 3, p-value = 0.1667
alternative hypothesis: true location shift is not equal to 0 

> wilcox.test(jitter(as.integer(grade)) ~ sex, data = gradesbysex)

Wilcoxon rank sum test

data:  jitter(as.integer(grade)) by sex 
W = 7, p-value = 0.7143
alternative hypothesis: true location shift is not equal to 0 

> wilcox.test(jitter(as.integer(grade)) ~ sex, data = gradesbysex)

Wilcoxon rank sum test

data:  jitter(as.integer(grade)) by sex 
W = 6, p-value = 0.5476
alternative hypothesis: true location shift is not equal to 0 


I'll let you judge elegance.


As for the barplots, I think all you need to do is specify the row and column 
order you'd like.

Try this example

> barplot(VADeaths, beside = TRUE)
> barplot(VADeaths[5:1,c(4, 2, 3, 1)], beside = TRUE)

Substitute your data, use beside=FALSE to stack, etc.

Steven McKinney


From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of 
Par Leijonhufvud [...@hunter-gatherer.org]
Sent: October 28, 2010 8:37 PM
To: rhelp
Subject: [R] wilcox.test; data type conversion?

I'm working on a quick tutorial for my students, and was planning on
using Mann-Whitney U as one of the tests.

I have the following (fake) data

 grade <- c("MVG", "VG", "VG", "G", "MVG", "G", "VG", "G", "VG")
 sex <- c( "male", "male", "female", "male", "female", "male", "female", 
"male", "male")
 gradesbysex <- data.frame(grade, sex)

The grades is in the Swedish system, where the order is G < VG < MVG

The idea is that they will investigate if they can show a grade
difference by sex (i.e. that the teacher gives better grades to boys or
girls).

Since the wilcox.test needs the order of the grades it wants numeric
vector  for the data. Is there a good and simple (i.e. student
compatible) way to handle this? I could tell them to enter data as
numbers instead, but an elegant way to do this inside R would be
preferable.


On the same theme, is there a way to tell barplot that, when making
stacked barplots, to stack the data in a particular order (default
appears to be alphabetical)?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to write to sqlite files

2010-10-19 Thread Steven McKinney
You will need to install the RSQLite package, see e.g.

http://cran.r-project.org/web/packages/RSQLite/index.html

Review the installation instructions there for the setup appropriate for your 
situation.

Review the pdf manual there for examples of command sequences involved with
connecting to the database, and running data queries.



Steven McKinney


From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of 
lord12 [trexi...@yahoo.com]
Sent: October 19, 2010 11:42 AM
To: r-help@r-project.org
Subject: [R] How to write to sqlite files

In R, I know how to write ti csv files. However, how do I write to database
files?
--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-write-to-sqlite-files-tp3002586p3002586.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] help

2010-10-14 Thread Steven McKinney
Here's one way to do it:

Use unlist() to change the function's output from a list to a vector.

> f <- function(x) {return(lapply(1:6, function(z) z + x))}
> f(2)
[[1]]
[1] 3

[[2]]
[1] 4

[[3]]
[1] 5

[[4]]
[1] 6

[[5]]
[1] 7

[[6]]
[1] 8

> x <- seq(0,1, by=0.1)
> result <- matrix(0, nrow=6, ncol=length(x))
> for (i in 1:length(x)){result[,i] <- unlist(f(x[i]))}
> result
 [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
[1,]1  1.1  1.2  1.3  1.4  1.5  1.6  1.7  1.8   1.9 2
[2,]2  2.1  2.2  2.3  2.4  2.5  2.6  2.7  2.8   2.9 3
[3,]3  3.1  3.2  3.3  3.4  3.5  3.6  3.7  3.8   3.9 4
[4,]4  4.1  4.2  4.3  4.4  4.5  4.6  4.7  4.8   4.9 5
[5,]5  5.1  5.2  5.3  5.4  5.5  5.6  5.7  5.8   5.9 6
[6,]6  6.1  6.2  6.3  6.4  6.5  6.6  6.7  6.8   6.9 7
> 

HTH

Steven McKinney


From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of 
li li [hannah@gmail.com]
Sent: October 14, 2010 2:50 PM
To: r-help
Subject: [R] help

Dear all,
  I have a function f(x)  which return a list as result.

$T1
[1] 0.03376190
$T2
[1] 0.04725
$T3
[1] 0.3796071
$T4
[1] 0.3713452
$T5
[1] 0.4523651
$T6
[1] 0.4575873

  I now find the result for a vector of x values at one time. I want to
store the reuslt
for each xi value in a column of a matrix

 x <- seq(0,1, by=0.1)
result <- matrix(0, nrow=6, ncol=length(x))

for (i in 1:length(x)){result[,i] <- f(x[i])}

It is not working. Can some help me.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [Rd] Selecting multiple columns with same name

2010-10-08 Thread Steven McKinney
This is better suited for R-help than R-devel, so I'm copying to the R-help 
list:

> -Original Message-
> From: r-devel-boun...@r-project.org [mailto:r-devel-boun...@r-project.org] On 
> Behalf Of Martin Kerr
> Sent: October-08-10 3:09 AM
> To: r-de...@r-project.org
> Subject: [Rd] Selecting multiple columns with same name
> 
> 
> Hello all,
> I've been working on a project involving clustering algorithms and I've hit a 
> bit of a snag.
> I have my main data frame with is 31 X 1000, I have fed this into dif and 
> hclust in order to produce a
> 31 item vector stating the perceived grouping of the columns.E.g.
> 1 1 1 1 2 2 2 2 1 1 1 1 2 2 2 2 3 3 3 3
> etc.
> What I want to do is use this information to separate each groups worth of 
> data into a separate frame
> so I can perform additional calculations on them.I've been attempting to use 
> subset by setting the
> colnames to the grouping results thus:
> colnames(dataFrame) <- groupssubset(dataFrame,select=c(colname="1")

Here's one way to do it


> fdf <- as.data.frame(matrix(rnorm(100), ncol = 10))
> fdf
V1 V2 V3   V4  V5  V6   
   V7 V8
1   0.35264797 -0.4280407  0.4706150 -0.772936086  0.59984719  0.97885696  
0.13569457  0.5005072
2  -0.09800830 -0.3946618 -0.6816040 -0.173057585 -0.95377116  1.32702531  
0.51894946  1.8779715
3   0.00585569  0.5240508  0.6334294  0.775787713  1.13537433 -0.75363920  
0.09240357  1.7652420
4  -1.28667042 -0.3808195 -1.3735447  0.601288920  0.37448709  1.20875897  
1.26392905  0.3573046
5   1.05127892 -0.1717773  0.4795011  0.408584918 -1.57947076 -1.76699298 
-2.15778156 -0.6202422
6   0.49935805 -0.5858645  0.1466443  1.094320479 -0.01534562  0.03349714 
-0.86508986  0.3335337
7   0.64649298 -0.8044967  1.7273739  0.005654138  0.88092416 -0.43467177  
0.33123616 -1.0062133
8   0.67393707 -0.8927181  1.9050954  0.824576116 -1.49872072  0.1361 
-0.98904113 -1.1763053
9  -0.06217531 -0.6020426 -0.5198348  0.475774170  0.72492806 -1.93507347 
-0.26827918 -0.7902781
10 -4.05961249 -1.1839906 -2.1285662  0.992767748 -1.45187700 -0.32688422  
0.92335149  0.2405690
V9V10
1  -1.10422899  0.7343708
2  -0.21511926 -0.3472193
3  -1.56249900  0.6228027
4  -1.64679524  0.9548577
5   0.31530976  0.7420800
6   0.02644282 -1.0393438
7  -0.70669500 -0.8335578
8  -0.29898269  1.8679939
9  -0.08449491 -0.7413130
10  0.66960457 -0.464
> colGroups <- c(1,1,1,2,1,3,3,2,3,1)
> fdf[, colGroups == 1]
V1 V2 V3  V5V10
1   0.35264797 -0.4280407  0.4706150  0.59984719  0.7343708
2  -0.09800830 -0.3946618 -0.6816040 -0.95377116 -0.3472193
3   0.00585569  0.5240508  0.6334294  1.13537433  0.6228027
4  -1.28667042 -0.3808195 -1.3735447  0.37448709  0.9548577
5   1.05127892 -0.1717773  0.4795011 -1.57947076  0.7420800
6   0.49935805 -0.5858645  0.1466443 -0.01534562 -1.0393438
7   0.64649298 -0.8044967  1.7273739  0.88092416 -0.8335578
8   0.67393707 -0.8927181  1.9050954 -1.49872072  1.8679939
9  -0.06217531 -0.6020426 -0.5198348  0.72492806 -0.7413130
10 -4.05961249 -1.1839906 -2.1285662 -1.45187700 -0.464
> fdf[, colGroups == 2]
 V4 V8
1  -0.772936086  0.5005072
2  -0.173057585  1.8779715
3   0.775787713  1.7652420
4   0.601288920  0.3573046
5   0.408584918 -0.6202422
6   1.094320479  0.3335337
7   0.005654138 -1.0062133
8   0.824576116 -1.1763053
9   0.475774170 -0.7902781
10  0.992767748  0.2405690
> fdf[, colGroups == 3]
V6  V7  V9
1   0.97885696  0.13569457 -1.10422899
2   1.32702531  0.51894946 -0.21511926
3  -0.75363920  0.09240357 -1.56249900
4   1.20875897  1.26392905 -1.64679524
5  -1.76699298 -2.15778156  0.31530976
6   0.03349714 -0.86508986  0.02644282
7  -0.43467177  0.33123616 -0.70669500
8   0.1361 -0.98904113 -0.29898269
9  -1.93507347 -0.26827918 -0.08449491
10 -0.32688422  0.92335149  0.66960457
> 

and this can be automated as a loop or with lapply() and the like.

HTH

Steve McKinney

> This however only returns the first column rather than all instances of a 
> column with that name. Note
> that these columns may not necessarily be contiguous.
> Is this the correct way to go about this?
> Thank You
> Martin Kerr
>   [[alternative HTML version deleted]]
> 
> __
> r-de...@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] interactive session

2010-09-30 Thread Steven McKinney
For those that want it all...

> {cat("?"); a<-readLines(n=1)
+ print("hey")
+ print(b<-paste("t",a,sep=""))}
?ada
[1] "hey"
[1] "tada"
> b
[1] "tada"
> 


Steven McKinney


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
> Behalf Of Peter Dalgaard
> Sent: September-30-10 1:23 PM
> To: Pam
> Cc: r-help@r-project.org
> Subject: Re: [R] interactive session
> 
> On 09/30/2010 03:33 PM, Pam wrote:
> > Thanks Niels but it won't do.. please copy and paste the 2 lines below 
> > together
> > to your console in order to see what I mean:
> >
> > cat("?"); a<-readLines(n=1)
> > b<-paste("t",a,sep="")
> >
> > anyone / any idea to overcome this problem?
> >
> > Best,
> > Fatih
> >
> 
> You might want to source() a file with those lines rather than pasting
> them to the console. There's just no way you can retroactively insert
> text between two already submitted lines.
> 
> You can do things like this, though
> 
> {cat("?"); a<-readLines(n=1)
> "hey"
>  b<-paste("t",a,sep="")}
> 
> However, there's a catch
> 
> > {cat("?"); a<-readLines(n=1)
> + "hey"
> +  b<-paste("t",a,sep="")}
> ?ada
> > b
> [1] "tada"
> 
> Notice that the "hey" doesn't print.
> 
> --
> Peter Dalgaard
> Center for Statistics, Copenhagen Business School
> Phone: (+45)38153501
> Email: pd@cbs.dk  Priv: pda...@gmail.com
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] bptest

2010-09-24 Thread Steven McKinney
Hi Rob,

I googled "cran bptest"
and the first hit was for package "lmtest".

I installed lmtest, loaded it up, and bptest is available there.

> require("lmtest")
Loading required package: lmtest
Loading required package: zoo
> bptest
function (formula, varformula = NULL, studentize = TRUE, data = list()) ...


So see if you can get package lmtest.

HTH

Cheers


Steven McKinney


From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of 
robm [r.malp...@ntlworld.com]
Sent: September 24, 2010 6:53 AM
To: r-help@r-project.org
Subject: [R] bptest

Hi

I'm very new to R but have plenty of experience with statistics and other
packages like SPSS, SAS etc.

I have a dataset of around 20 columns and 200 rows.   I'm trying to fit a
very simple linear model between two variables.   Having done so, I want to
test the model for heteroscedasticity using the Breusch-Pagan test.
Apparently this is easy in R by simply doing

bptest(modelCH, data=KP)

I've tried this but I'm told it cannot find function bptest.   It's here
where I'm struggling.   I'm probably wrong but as far as I can see, bptest
is part of the lm package which, as far as I know, I have installed.

Irrespective of the fact I'm not sure how to tell bptest which is the
dependent and explanatory variables - there's a more fundamental problem if
it can't find the bptest function.

I have searched the documentation - albeit briefly so if anyone could help
I'd be very grateful

Rob

QBE Management
--
View this message in context: 
http://r.789695.n4.nabble.com/bptest-tp2553506p2553506.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] French accents on characters

2010-08-04 Thread Steven McKinney

In addition to the reply from Duncan Murdoch,

help("text")
and
demo("Hershey")

are very informative in this regard.

Simple example with plot for added text:

plot(1,1)
text(0.8, 0.8, "B\\`ad Fr\\'ench", vfont=c("serif", "plain"))
title(main = "B\u{E0}d Fr\u{E9}nch")




>From the help page for plotmath in package grDevices, see this
example (to which I added two additional entries for
a_grave and a_acute) for added text:

> plot.new(); plot.window(c(0,4), c(15,1))
> text(1, 1, "universal", adj=0); text(2.5, 1,  "\\042")
> text(3, 1, expression(symbol("\042")))
> text(1, 2, "existential", adj=0); text(2.5, 2,  "\\044")
> text(3, 2, expression(symbol("\044")))
> text(1, 3, "suchthat", adj=0); text(2.5, 3,  "\\047")
> text(3, 3, expression(symbol("\047")))
> text(1, 4, "therefore", adj=0); text(2.5, 4,  "\\134")
> text(3, 4, expression(symbol("\134")))
> text(1, 5, "perpendicular", adj=0); text(2.5, 5,  "\\136")
> text(3, 5, expression(symbol("\136")))
> text(1, 6, "circlemultiply", adj=0); text(2.5, 6,  "\\304")
> text(3, 6, expression(symbol("\304")))
> text(1, 7, "circleplus", adj=0); text(2.5, 7,  "\\305")
> text(3, 7, expression(symbol("\305")))
> text(1, 8, "emptyset", adj=0); text(2.5, 8,  "\\306")
> text(3, 8, expression(symbol("\306")))
> text(1, 9, "angle", adj=0); text(2.5, 9,  "\\320")
> text(3, 9, expression(symbol("\320")))
> text(1, 10, "leftangle", adj=0); text(2.5, 10,  "\\341")
> text(3, 10, expression(symbol("\341")))
> text(1, 11, "rightangle", adj=0); text(2.5, 11,  "\\361")
> text(3, 11, expression(symbol("\361")))
> text(1, 12, "a_grave", adj=0); text(2.5, 12,  "\\'a")
> text(3, 12, "\\'a", vfont=c("serif", "plain"))
> text(1, 13, "a_acute", adj=0); text(2.5, 13,  "\\`a")
> text(3, 13, "\\`a", vfont=c("serif", "plain"))
> 

Run 
> demo("Hershey")
and look at the functions defined at the beginning of the demo
for more examples of writing commands to get these accented
letters and other glyphs into plots.


http://www.unicode.org/charts/
has useful information on unicode values for all kinds
of alphabets.  e.g. Click on the "Latin-1 Supplements" link
(URL is:  http://www.unicode.org/charts/PDF/U0080.pdf)
to get a PDF with unicode values as used above
(e.g. \u{E0} for a-aigu).


HTH


Steven McKinney

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre




From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of 
Jennifer Young [jennifer.yo...@math.mcmaster.ca]
Sent: August 4, 2010 12:40 PM
To: r-help@r-project.org
Subject: [R] French accents on characters

Hello

Could someone please direct me to the correct commands for adding accents
(grave and aigu) to a letter in a plot title, label, or in added text? I'm
sure there's a handy list somewhere, but I've failed in coming up with the
correct search words to find it.

Thank you muchly!
Jen

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] multiple R sessions from one working directory using GNU screen

2010-08-03 Thread Steven McKinney
Dear Olga

An R session is conducted entirely in the RAM memory of your computer,
and each invocation of R will have its own memory space, not shared
with any other application, including another R session.

You will have to architect a scheme to allow one R session to find out
about events and objects in another R session.  This might involve writing
files to disk from session R1, having session R2 check the directory
for new files every now and then, and so on.  (save.image() saves
objects from one R session to a file on disk, but there can be other items
in the saved file such as environments that are not straightforward
to manipulate or investigate from another R session.)

Your situation is similar to cluster computing, where a task is broken up
into independent pieces and each piece is handled by a separate R process.
Reading about cluster computing with R might help you figure out
strategies to share bits of data and information across multiple R sessions.

HTH


Steven McKinney

From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of 
Olga Lyashevska [o...@herenstraat.nl]
Sent: August 3, 2010 8:04 AM
To: r-help@r-project.org
Cc: Olga Lyashevska
Subject: [R] multiple R sessions from one working directory using GNU screen

Dear all,

I am using GNU screen to run multiple R sessions from one working
directory in order to split task, however I noticed that dataset is not
synchronized e.g. if I have two sessions R1 and R2, and I remove an
object from R1, R2 doesn't change as expected or change at random.
I have tried to save.image(), q() and then restart both sessions, but it
does not help.


Any suggestions?

Many thanks
Olga

R version 2.11.1 (2010-05-31)
2010 x86_64 GNU/Linux
Ubuntu 10.04.1 LTS
Intel(R) Xeon(R) CPU X3220 @ 2.40GHz
MemTotal: 4050180 kB

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to do a IF ELSE in a matrix format

2010-08-02 Thread Steven McKinney
Here's one way:

edu[edu[, 1] < 0.5, 2] <- 1

> set.seed(1)
> edu<-matrix(0,1000,2)
> edu1<-runif(1000,0,1)
> edu[,1]<-edu1
> head(edu)
  [,1] [,2]
[1,] 0.26550870
[2,] 0.37212390
[3,] 0.57285340
[4,] 0.90820780
[5,] 0.20168190
[6,] 0.89838970
> edu[edu[, 1] < 0.5, 2] <- 1
> head(edu)
  [,1] [,2]
[1,] 0.26550871
[2,] 0.37212391
[3,] 0.57285340
[4,] 0.90820780
[5,] 0.20168191
[6,] 0.89838970
> 


Steven McKinney





> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
> Behalf Of Hey Sky
> Sent: August-02-10 6:17 PM
> To: r-help@r-project.org
> Subject: [R] how to do a IF ELSE in a matrix format
> 
> Hey, Rers
> 
> I am new to R and this question may has been asked many times or a very old
> one. I have checked
> 
> the archive but found nothing (maybe I used the wrong key words). any advise 
> is
> appreciated
> 
> the question is:
> 
> I try to replace a col vector with some other values, see
> 
> edu<-matrix(0,1000,2)
> 
> edu1<-runif(1000,0,1)
> edu[,1]<-edu1
> 
> the colum  vector edu[,1]  would have value between 0 and 1. if I want to
> replace the value of edu[,2]
> 
> to 1 if edu[,1] less than 0.5, how to do it in the form of matrix but 
> not under
> a loop structure?
> 
> I have tried:
> 
> if (edu[,1]>=0.5) edu[,2]=1
> 
> there is only the first 0.5 have been done but not all the others. thanks for
> any kind answer.
> 
> Nan
> from Montreal
> 
> 
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to find non-ASCII characters in .Rd files?

2010-07-29 Thread Steven McKinney
Does function 
   showNonASCII(x)
in package "tools" do what you need?


Steven McKinney

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre




From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of 
Michael Friendly [frien...@yorku.ca]
Sent: July 29, 2010 6:27 PM
To: r-help
Subject: [R] how to find non-ASCII characters in .Rd files?

[Env: Win XP Pro / R 2.11.1]

I keep occasionally running into the annoying problem of getting
warnings from R CMD check regarding
non ASCII characters in .Rd files, without any easy way of finding them.

Mostly these come from copy/paste of references or other material from
web pages or Win applications,
and often involve varieties of quote-like characters, e.g., if I try to
include output from anova()
as a documentation comment,

#Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

These are extremely hard to spot in an editor and
R CMD check doesn't give line numbers or indicate what non-ASCII
characters were found,
so the only way I can easily find the offending characters is to copy
the .Rd file to a linux machine
and run
od -c file.Rd |more

What commands can I run in R against a .Rd file to locate such non-ASCII
characters


--
Michael Friendly Email: frien...@yorku.ca
Professor, Psychology Dept.
York University  Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele Streethttp://www.math.yorku.ca/SCS/friendly.html
Toronto, ONT  M3J 1P3 CANADA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] re-ordering bwplot

2010-07-23 Thread Steven McKinney

Here's one way - you can reorder the levels of a factor variable by specifying
the levels the way you want them.  

> require("lattice")
Loading required package: lattice
> bwplot( conc ~ Type : Treatment, data = CO2 )
> str(CO2$Type)
 Factor w/ 2 levels "Quebec","Mississippi": 1 1 1 1 1 1 1 1 1 1 ...
> 
> str(CO2$Treatment)
 Factor w/ 2 levels "nonchilled","chilled": 1 1 1 1 1 1 1 1 1 1 ...
> ?bwplot
> CO2 <- CO2  ## Make a copy in the global environment
> CO2$Type2 <- factor(as.character(CO2$Type), levels = c("Mississippi", 
> "Quebec"))
> with(CO2, table(Type, Type2))
 Type2
Type  Mississippi Quebec
  Quebec0 42
  Mississippi  42  0
> quartz() ## Your plot window type here (quartz() works on a Mac)
> bwplot( conc ~ Type : Treatment, data = CO2 )
> quartz()
> bwplot( conc ~ Type2 : Treatment, data = CO2 )
> 

Steven McKinney

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre



From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of 
Eck, Bradley J [brad@mail.utexas.edu]
Sent: July 23, 2010 10:02 AM
To: r-help@r-project.org
Subject: [R] re-ordering bwplot

Dear list:

I'm using bwplot to compare concentrations by location and treatment as in:

# using built in data
bwplot( conc ~ Type : Treatment, data = CO2 )

I would like the order of the plots to be: 3,4,1,2.   I can't seem to figure 
this out with index.cond or permc.cond.

Any help is appreciated!

Brad Eck


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Can RMySQL run on Windows using a remote MySQL server?

2010-07-20 Thread Steven McKinney
So the answer is yes.

Install the MySQL Workbench application on your XP box
and work out the URL, account name and password, and
port connection issues to the reomte MySQL database.  
You should then have a MySQL driver available.  

Set up a User DSN or System DSN to the remote database 
using ODBC Data Source Administrator widget in 
Windows administrative tools.  

Install RMySQL, configure and enjoy.

Good luck!

Steven McKinney

From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of 
neatgadgets [neatgadg...@gmail.com]
Sent: July 19, 2010 3:26 PM
To: r-help@r-project.org
Subject: Re: [R] Can RMySQL run on Windows using a remote MySQL server?

I'm using Windows XP.
--
View this message in context: 
http://r.789695.n4.nabble.com/Can-RMySQL-run-on-Windows-using-a-remote-MySQL-server-tp2293533p2294705.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Can RMySQL run on Windows using a remote MySQL server?

2010-07-19 Thread Steven McKinney
I have done this from my previous computer, a Windows PC
running Windows XP (32 bit).  So in theory it can be done.
I had installed the MySQL Workbench application as well
which no doubt set up the MySQL dll's needed. 

My new computer is running Windows 7 64-bit for which
RMySQL is not yet built.

What OS are you running?

Steven McKinney

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre




From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of 
neatgadgets [neatgadg...@gmail.com]
Sent: July 18, 2010 9:09 PM
To: r-help@r-project.org
Subject: [R] Can RMySQL run on Windows using a remote MySQL server?

Can RMySQL run on Windows using a remote MySQL server?
--
View this message in context: 
http://r.789695.n4.nabble.com/Can-RMySQL-run-on-Windows-using-a-remote-MySQL-server-tp2293533p2293533.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] RMySQL package on 64bit R for Windows

2010-07-16 Thread Steven McKinney
I think this is a question for R-devel so I'm
cross-posting there with apologies.

I've just acquired a Windows 7 64-bit box and
also will need RMySQL eventually.

Is there any information about issues involved
with compiling RMySQL for Windows 64-bit?


Steven McKinney
Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre



> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
> Behalf Of abeSRT
> Sent: July-16-10 1:00 PM
> To: r-help@r-project.org
> Subject: Re: [R] RMySQL package on 64bit R for Windows
> 
> 
> Did you ever find a solution for this?
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/RMySQL-package-on-64bit-R-for-Windows-
> tp2248726p2291892.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] inverse function of melt

2010-06-18 Thread Steven McKinney

?cast

A reproducible example would get you more feedback.

Steven McKinney


From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of 
n.via...@libero.it [n.via...@libero.it]
Sent: June 18, 2010 6:39 AM
To: r-help@r-project.org
Subject: [R] inverse function of melt

Dear list,
I'm looking for an inverse function of melt(which is in package 
reshape).Namely, I had a data frame like this
(Table1)

YEAR  VAR1  VAR2  VAR3
1995  7   3 45

1996   5 632

1997   6  10 15
I transformed my data by using the melt function and my data was reshaped in 
the following format:
(Table2)

YEARvariable  value

1995VAR1  7

1996 VAR1 5

1997 VAR1 6
1995VAR2  7

1996 VAR2 5

1997 VAR2 6
1995VAR3  7

1996 VAR3 5

1997 VAR3 6

 Now I would like to come back to the original format, namely table1.
Anyone could help me??
Thanks for your attention!






[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Competing with SPSS and SAS: improving code that loops throughrows (data manipulation)

2010-03-26 Thread Steven McKinney
Hi Dimitri

Your code is complex, so this won't be easy for anyone to deparse.

I think part of the issue is in the calculation you are doing in your innermost 
loop

data[data[[group.var]] %in% subgroup, name][case]=
1-((1-data[data[[group.var]] %in% subgroup,
name][case-1]*exp(1)^d)/(exp(1)^(data[data[[group.var]] %in% subgroup,
var][case]*l*10)))

This looks to me to be a lag-1 calculation, and not all lagged calculations are 
vectorizable, so
this part may have to be done in a loop.  If so, working on a matrix instead of 
a dataframe
can help speed up code dramatically, especially with big dataframes (I've seen 
10-fold
increases in speed by working with big matrices with no row or column names
instead of dataframes).

Pre-declare a matrix big enough to hold all your results outside of the loop, 
then do your calculations using the matrix inside the loop.  You can convert
the matrix to a dataframe after the loop completes if needed.

HTH

Steven McKinney



From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of 
Dimitri Liakhovitski [ld7...@gmail.com]
Sent: March 26, 2010 6:40 PM
To: Bert Gunter
Cc: r-help
Subject: Re: [R] Competing with SPSS and SAS: improving code that loops 
throughrows (data manipulation)

My sincere apologies if it looked large. Let me try again with less code.
It's hard to do less than that. In fact - there is nothing in this
code but 1 formula and many loops, which is the problem I am not sure
how to solve.
I also tried to be as clear as possible with the comments.
Dimitri

## START OF THE CODE TO PRODUCE SMALL DATA EXAMPLE
  set.seed(123)
  
data<-data.frame(group=c(rep("first",10),rep("second",10)),a=abs(round(rnorm(20,mean=0,
sd=.55),2)), b=abs(round(rnorm(20,mean=0, sd=.55),2)))
  data   # "data" it is the data frame to work with
## END OF THE CODE TO PRODUCE SMALL DATA EXAMPLE. In real life "data"
would contain up to 150-200 rows PER SUBGROUP

### Specifying useful parameters used in the slow code below:
vars<-names(data)[2:3]# names of variables used in
transformation; in real life - up to 50-60 variables
group.var<-names(data)[1]# name of the grouping variable
subgroups<-levels(data[[group.var]])   # names of subgroups; in real
life - up to 30 subgroups

# OBJECTIVE:
# Need to create new variables based on the old ones (a & b)
# For each new variable, the value in a given row is a function of (a)
2 constants (that have several levels each),
# (b) value of the original variable (e.g., a.ind.to.max"), and the
value in the previous row on the same new variable
# Plus - it has to be done by subgroup (variable "group")

# Defining 2 constants:
constant1<-c(1:3)# constant 1 used in transformation -
has 3 levels, in real life - up to 7 levels
constant2<-seq(.15,.45,.15)  # constant 2 used in transformation - has
3 levels, in real life - up to 7 levels

### CODE THAT IS SLOW. Reason - too many loops with the inner-most
loop being very slow - as it is looping through rows:

for(var in vars){   # looping through variables
  for(c1 in 1:length(constant1)){# looping through values of constant1
   for(c2 in 1:length(constant2)){   # looping through values of constant2
 d=log(0.5)/constant1[c1]
  l=-log(1-constant2[c2])
  name<-paste(var,constant1[c1],constant2[c2]*100,".transf",sep=".")
  data[[name]]<-NA
  for(subgroup in subgroups){ # looping through subgroups
data[data[[group.var]] %in% subgroup, name][1] =
1-((1-0*exp(1)^d)/(exp(1)^(data[data[[group.var]] %in% subgroup,
var][1]*l*10)))

  ### THIS SECTION IS THE SLOWEST - BECAUSE I AM LOOPING THROUGH ROWS:
 for(case in 2:nrow(data[data[[group.var]] %in% subgroup,
])){ # looping through rows
   data[data[[group.var]] %in% subgroup, name][case]=
1-((1-data[data[[group.var]] %in% subgroup,
name][case-1]*exp(1)^d)/(exp(1)^(data[data[[group.var]] %in% subgroup,
var][case]*l*10)))
 }
  ### END OF THE SLOWEST SECTION (INNERMOST LOOP)
  }
  }
   }
}

### END OF THE CODE

On Fri, Mar 26, 2010 at 5:25 PM, Bert Gunter  wrote:
> Dmitri:
>
> If you follow the R posting guide you're more likely to get useful replies.
> In particular it asks for **small** reproducible examples -- your example is
> far more code then I care to spend time on anyway (others may be more
> willing or more able to do so of course). I suggest you try (if you haven't
> already):
>
> 1. Profiling the code using Rprof to isolate where the time is spent.And
> then...
>
> 2. Writing a **small** reproducible example to exercise that portion of the
> code and post it with your question to the list. If you need to...
> Typically, 

Re: [R] Factor variables with GAM models

2010-03-19 Thread Steven McKinney
Hi Noah

GAM models were developed to assess the functional form
of the relationship of continuous predictor variables to the
response, so weren't really meant to handle factor variables
as predictor variables.  GAMs are of the form
E(Y | X1, X2, ...) = So + S(X1) + S(X2) + ...
where S(X) is a smooth function of X.

Hence you might want to rethink why you'd want a
factor variable as a predictor variable in a GAM.
This is why the gam machinery doesn't just do the
factor conversion to indicator variables as is done in
lm.

HTH

Steven McKinney


From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of 
Noah Silverman [n...@smartmediacorp.com]
Sent: March 19, 2010 12:54 PM
To: r-help@r-project.org
Subject: [R] Factor variables with GAM models

I'm just starting to learn about GAM models.

When using the lm function in R, any factors I have in my data set are
automatically converted into a series of binomial variables.

For example, if I have a data.frame with a column named color and values
"red", "green", "blue".   The lm function automatically replaces it with
3 variables colorred, colorgreen, colorblue which are binomial {0,1}

When I use the gam function, R doesn't do this so I get an error.

1) Is there a way to ask the gam function to do this conversion for me?
2) If not, is there some other tool or utility to make this data
transformation easy?
3) Last option - can I use lm to transform the data and then extract it
into a new data.frame to then pass to gam?

Thanks!!!

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Merging Matrices

2010-03-19 Thread Steven McKinney

> From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf 
> Of duncandonutz [dwads...@unm.edu]
> Sent: March 19, 2010 1:11 PM
> To: r-help@r-project.org
> Subject: [R] Merging Matrices
> 
> I have two symmetric matrices, but of different dimensions.  The entries are
> identified by specific labels some of which are shared by both matrices.  I
> would like to sum the two matrices, but retain the union of the two.  In
> other words, I want the result to be the same size as the larger of the two
> matrices, but with the entries that they share added together.
> 
> cbind() and rbind() don't work since the matrices are different sizes
> (symmetrically).  Merge doesn't want to cooperate, although it might be
> because I can't understand the documentation. I tried the package "reshape"
> but didn't get very far with that either.  I tried simply adding (+) them,
> but that was a stupid first try.

With appropriate indexing of the matrices via their shared names, 
it can be done as follows:

> A <- matrix(1:9, nrow = 3)
> A[lower.tri(A)] <- A[upper.tri(A)]
> B <- matrix(1:25, nrow = 5)
> B[lower.tri(B)] <- B[upper.tri(B)]
> dimnames(A) <- list(letters[c(2, 4, 5)], letters[c(2, 4, 5)])
> dimnames(B) <- list(letters[1:5], letters[1:5])
> A
  b d e
b 1 4 7
d 4 5 8
e 7 8 9
> B
   a  b  c  d  e
a  1  6 11 16 21
b  6  7 12 17 22
c 11 17 13 18 23
d 12 18 22 19 24
e 16 21 23 24 25
> B[dimnames(A)[[1]], dimnames(A)[[1]]]
   b  d  e
b  7 17 22
d 18 19 24
e 21 24 25
> C <- B
> C[dimnames(A)[[1]], dimnames(A)[[1]]] <- A + B[dimnames(A)[[1]], 
> dimnames(A)[[1]]]
> C
   a  b  c  d  e
a  1  6 11 16 21
b  6  8 12 21 29
c 11 17 13 18 23
d 12 22 22 24 32
e 16 28 23 32 34
> C - B
  a b c d e
a 0 0 0 0 0
b 0 1 0 4 7
c 0 0 0 0 0
d 0 4 0 5 8
e 0 7 0 8 9
> 


HTH

Steve McKinney

> Any help is appreciated!
> --
> View this message in context: 
> http://n4.nabble.com/Merging-Matrices-tp1605474p1605474.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Plot frame border to start at zero?

2010-01-19 Thread Steven McKinney


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
> project.org] On Behalf Of Douglas M. Hultstrand
> Sent: Tuesday, January 19, 2010 4:42 PM
> To: R mailing list
> Subject: [R] Plot frame border to start at zero?
> 
> Hello,
> 
> I am creating plots of hourly precipitation and accumulated
> precipitation (on different axis, see attached image). I was wondering
> how can I have the plot frame (black border) start at zero, it looks
> like it is plotted less than zero?

It doesn't appear to be plotted starting at less than zero.

S language plots were designed from the beginning to have
a little extra space around the data points so none of your
data points get lost or obscured because they overstrike the
borders.  Thus the surrounding box is drawn about 5% outside
of the data range in the X and Y axis directions, to avoid
any data point overstrike.

You can specify the x and y axis data ranges with the
xlim and ylim arguments if you want complete control
of the data range.  It can be done, but you will have to 
work hard to get the graph to do bad things like print 
over your data!  (See the various plot control arguments
in e.g. help("par") ).

> 
> The code I use to create the png files is below:
> 
> CairoPNG(PNG_file,width=1000, height=600, pointsize=14, bg="white")
> opar <- par(mai = c(.8, .8, 1, .8))# Set margins
> plot(ppt,type="h",col="dark grey",ylim=c(0,max_ppt+1), lwd=6,
> xlab='', ylab='', main=title, frame.plot=T, axes=F)
> axis(2, col="black", las=2)
> par(new=T) # R to overwrite the first plot
> plot(accum, type="l", col="blue", axes=F, ylab='', xlab='',
> lwd=2)
> axis(1, col="black", las=1)
> axis(4, pretty(c(0, max_accum) ), col="black", las=2)
> legend("topleft", legend=c('Incremental', 'Accumulated'),
> col=c("dark grey", "blue"), lwd=c(6, 2.0))
> legend((length(n)-4),max_accum,
> legend=paste(round(max_accum,2)), bty="n", cex=.90) ### max accum
> location
> mtext(paste("Lat:", .lat, "Lon:", .lon), cex=0.95)
> mtext("Index Hour", side=1, line=2)
> mtext("Incremental Precipitation (in)", side=2, line=2.5)
> mtext("Accumulated Precipitation (in)", side=4, line=2)
> dev.off()
> 
> 
> Thanks for all your help,
> Doug
> 
> -
> Douglas M. Hultstrand, MS
> Senior Hydrometeorologist
> Metstat, Inc. Windsor, Colorado
> voice: 970.686.1253
> email: dmhul...@metstat.com
> web: http://www.metstat.com
> -






Steven McKinney, Ph.D.

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre

email: smckinney -at-  bccrc +dot+ ca
tel: 604-675-8000 x7561

BCCRC
Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C.
V5Z 1L3

Canada

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] help writing for loop

2009-11-25 Thread Steven McKinney
Hi,

The great thing about the S language is that it is 'vectorized',
so you can do a lot of matrix manipulations without loops.

Here's a smaller example of what you describe.
matrix A with 3 columns and 10 rows (instead of 100)
matrix B with 3 columns and 15 rows (instead of 1500)

> set.seed(123)
> mA <- matrix(runif(30), ncol = 3)
> mA
   [,1]   [,2]  [,3]
 [1,] 0.2875775 0.95683335 0.8895393
 [2,] 0.7883051 0.45333416 0.6928034
 [3,] 0.4089769 0.67757064 0.6405068
 [4,] 0.8830174 0.57263340 0.9942698
 [5,] 0.9404673 0.10292468 0.6557058
 [6,] 0.0455565 0.89982497 0.7085305
 [7,] 0.5281055 0.24608773 0.5440660
 [8,] 0.8924190 0.04205953 0.5941420
 [9,] 0.5514350 0.32792072 0.2891597
[10,] 0.4566147 0.95450365 0.1471136

> x <- seq(15)
> mB <- cbind(1, x, x^2)
> mB
 x
 [1,] 1  1   1
 [2,] 1  2   4
 [3,] 1  3   9
 [4,] 1  4  16
 [5,] 1  5  25
 [6,] 1  6  36
 [7,] 1  7  49
 [8,] 1  8  64
 [9,] 1  9  81
[10,] 1 10 100
[11,] 1 11 121
[12,] 1 12 144
[13,] 1 13 169
[14,] 1 14 196
[15,] 1 15 225

> mR <- apply(mB, 1, function(x){mA %*% x})
> mR
  [,1] [,2]  [,3]  [,4]  [,5] [,6] [,7]
 [1,] 2.133950 5.759401 11.163931 18.347540 27.310227 38.05199 50.57284
 [2,] 1.934443 4.466187  8.383538 13.686496 20.375061 28.44923 37.90901
 [3,] 1.727054 4.326145  8.206250 13.367368 19.809500 27.53265 36.53681
 [4,] 2.449921 6.005363 11.549346 19.081867 28.602929 40.11253 53.61067
 [5,] 1.699098 3.769140  7.150594 11.843459 17.847736 25.16342 33.79052
 [6,] 1.653912 4.679328  9.121806 14.981344 22.257943 30.95160 41.06232
 [7,] 1.318259 3.196545  6.162963 10.217513 15.360195 21.59101 28.90995
 [8,] 1.528621 3.353106  6.365876 10.566930 15.956267 22.53389 30.29979
 [9,] 1.168515 2.363915  4.137635  6.489674  9.420032 12.92871 17.01571
[10,] 1.558232 2.954077  4.644149  6.628448  8.906974 11.47973 14.34671
  [,8] [,9] [,10] [,11] [,12] [,13] [,14]
 [1,] 64.87276 80.95176  98.80984 118.44700 139.86324 163.05856 188.03295
 [2,] 48.75440 60.98539  74.60199  89.60419 105.99201 123.76542 142.92445
 [3,] 46.82198 58.38816  71.23536  85.36358 100.77281 117.46305 135.43430
 [4,] 69.09735 86.57257 106.03633 127.48863 150.92947 176.35884 203.77676
 [5,] 43.72904 54.97896  67.54029  81.41304  96.59720 113.09277 130.89975
 [6,] 52.59011 65.53495  79.89685  95.67582 112.87184 131.48493 151.51508
 [7,] 37.31703 46.81224  57.39559  69.06706  81.82667  95.67440 110.61027
 [8,] 39.25398 49.39646  60.72722  73.24626  86.95358 101.84919 117.93309
 [9,] 21.68102 26.92466  32.74662  39.14689  46.12549  53.68240  61.81763
[10,] 17.50792 20.96335  24.71302  28.75691  33.09502  37.72737  42.65394
  [,15]
 [1,] 214.78642
 [2,] 163.46908
 [3,] 154.68657
 [4,] 233.18322
 [5,] 150.01814
 [6,] 172.96229
 [7,] 126.63428
 [8,] 135.20527
 [9,]  70.53119
[10,]  47.87474
> 

so the results matrix mR has one column for each element of x,
and the i-th row element is the solution for
y = a + bx + cx^2
for the i-th row of coefficients in the parameter estimates matrix mA

You can spot check with e.g. 

> mA[1,] %*% mB[15,]
 [,1]
[1,] 214.7864
> mA[1,] %*% mB[5,]
 [,1]
[1,] 27.31023
> mA[1,]
[1] 0.2875775 0.9568333 0.8895393
> mB[5,]
x
 1  5 25 
> 



HTH

Steven McKinney, Ph.D.

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre

email: smckinney +at+ bccrc +dot+ ca

tel: 604-675-8000 x7561

BCCRC
Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C.
V5Z 1L3
Canada


From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of 
Jessica Schedlbauer [jsche...@fiu.edu]
Sent: November 25, 2009 10:43 AM
To: r-help@r-project.org
Subject: [R] help writing for loop

Hi,

I’d like to ask for some help in writing a loop.  My situation is the following:

I have a matrix (matrix.A) containing 3 columns and 100 rows.  The columns 
represent parameter estimates a, b, and c.  The rows contain different values 
for these parameter estimates.  Each row is unique.

I want to insert these parameter estimates into a model (say, y = a + bx + 
cx^2) and solve for y given a separate matrix (matrix.B) of x values (where x 
has a length of 1500).

I want to solve for y 100 times using each set of the parameter estimates in 
matrix.A once.

At present my code looks like this and it only performs the first iteration.

For (i in 1:length(matrix.A)) { y <- matrix.A$a[[i]] + matrixA$b[[i]] * 
matrix.B$x + matrixA$c[[i]] * matrix.B$x^2)

I have not been able to figure out how to loop through the rows of parameter 
estimates in matrix.A.  I am new to writing loops, so any assistance would be 
much appreciated.

Regards,
Jessica Schedlbauer

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting

Re: [R] Where are usages like "== 2L" documented?

2009-11-16 Thread Steven McKinney

> -Original Message-
> From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com]
> Sent: Monday, November 16, 2009 4:52 PM
> To: Duncan Murdoch
> Cc: Steven McKinney; R Help
> Subject: Re: [R] Where are usages like "== 2L" documented?
> 
> On Mon, Nov 16, 2009 at 7:25 PM, Duncan Murdoch 
> wrote:
> > On 16/11/2009 6:47 PM, Steven McKinney wrote:
> >>
> >> ?NumericConstants
> >>
> >> will bring up a help page that mentions
> >> "All other numeric constants start with a digit or period and are
> either a
> >> decimal or hexadecimal constant optionally followed by L."
> >>
> >> and
> >>
> >> "An numeric constant immediately followed by L is regarded as an
> integer
> >> number when possible (and with a warning if it contains a "."). "

The word "integer" in the above sentence of the NumericConstants help
page is hyperlinked to the integer() function page.  There is then no
example or discussion of L there.

> >>
> >> but I haven't found discussion of it anywhere else in the help
> pages.
> >> Others may know what other help pages discuss this.
> >>
> >> I'm surprised that the help page invoked from
> >> ?integer
> >> does not discuss this.  Anyone know why not?
> >
> > This is part of the syntax of the language.  It has nothing to do
> with the
> > integer() function, which is what ?integer is asking about.
> 
> It might be useful to have a SeeAlso to NumericConstants on that help
> page for those who looked up ?integer thinking it might be about
> integer constants.

Yes, additional discussion of "L" would be very valuable.  I've had
several people ask me about usages, as this original poster did.
I think that increased use of L has outpaced updating of help entries.
Given that L is appearing in more places, I'd like to request additional
discussion of it and examples using it in help pages.

> class(1L)
[1] "integer"
> storage.mode(1L)
[1] "integer"

Since "integer" is the term often associated with this language construct,
that seems a natural place to say something about it, and direct users
to other appropriate help pages.

The help page for storage.mode() shows an example with "1i" in it,
could "1L" please also be added?  ("1.0" or "1." would also be useful.)

cex3 <- c("NULL","1","1:1","1i","list(1)","data.frame(x=1)",
  "pairlist(pi)", "c", "lm", "formals(lm)[[1]]",  "formals(lm)[[2]]",
  "y~x","expression((1))[[1]]", "(y~x)[[1]]",
  "expression(x <- pi)[[1]][[1]]")


The "L" language construct is often used in length checks such as
in the sample() function " if (length(x) == 1L ..."

The length() function help page discusses
" The default method currently returns an integer of length 1."
again with the "integer" hyperlinked to the integer() help page.
Since length() therefore can only assess integer lengths
from 0 to about 2^31 - 1 it would be helpful to discuss this
integer "L" construct and the range of values that can be expressed
with mode "integer" more fully somewhere in one of these help topics.


> sample
function (x, size, replace = FALSE, prob = NULL) 
{
if (length(x) == 1L && is.numeric(x) && x >= 1) {
if (missing(size)) 
size <- x
.Internal(sample(x, size, replace, prob))
}
else {
if (missing(size)) 
size <- length(x)
x[.Internal(sample(length(x), size, replace, prob))]
}
}



Best
Steve McKinney

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Where are usages like "== 2L" documented?

2009-11-16 Thread Steven McKinney


?NumericConstants

will bring up a help page that mentions 

"All other numeric constants start with a digit or period and are either a 
decimal or hexadecimal constant optionally followed by L."

and

"An numeric constant immediately followed by L is regarded as an integer number 
when possible (and with a warning if it contains a "."). "

but I haven't found discussion of it anywhere else in the help pages.
Others may know what other help pages discuss this.

I'm surprised that the help page invoked from
?integer
does not discuss this.  Anyone know why not?

Best
Steve McKinney




> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
> project.org] On Behalf Of Bryan Hanson
> Sent: Monday, November 16, 2009 3:23 PM
> To: R Help
> Subject: [R] Where are usages like "== 2L" documented?
> 
> Gurus:
> 
> I keep seeing other people¹s code that contain ideas like
> 
> If (x == 2L)
> X[-1L]
> X - 1L
> 
> I have some idea of what¹s going on, but where is the use of concepts
> like
> ³2L² documented?
> 
> Thanks, Bryan
> *
> Bryan Hanson
> Acting Chair
> Professor of Chemistry & Biochemistry
> DePauw University, Greencastle IN USA
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Adjusting Yaxis (ylim) limits on a plotMeans(DV, IV1, IV2, error.bars="se")

2009-11-05 Thread Steven McKinney
Hi Sergey

I've attached a script with a ylim argument added to plotMeans.

If you are using Rcmdr, you can load this script via the Rcmdr File menu item
File
  Open Script File

so save this plotMeans.R script somewhere,
and load it and run it in Rcmdr, then
do your plot command
 plotMeans(DV, IV1, IV2, error.bars="se", ylim = c(27, 99)) 
or whatever your desired ylim values are.

HTH
 

Steven McKinney, Ph.D.

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre

email: smckinney +at+ bccrc +dot+ ca

tel: 604-675-8000 x7561

BCCRC
Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C.
V5Z 1L3
Canada


From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of 
Sergios (Sergey) Charntikov [sergios...@gmail.com]
Sent: November 5, 2009 7:51 PM
To: r-help@r-project.org
Subject: [R] Adjusting Yaxis (ylim) limits on a plotMeans(DV, IV1, IV2, 
error.bars="se")

Hello everyone,

I have tried to look for this everywhere and so far have no luck.  I have a
plotMeans(DV, IV1, IV2, error.bars="se") graph that plots my data
(DV-continuous, IVs are factors, IV1 - two levels, IV2-four levels).  I am
trying to increase a scale of my y-axis (to be consistent with my other
graphs), but unfortunately nothing works with "plotMeans" function, which is
the only one I see so fat to plot what I need.  Could anyone provide any
suggestions or advice?

Thank you.

Sergey (UNL, Behavioral Psychopharmacology Lab)

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

plotMeans <- 
function (response, factor1, factor2, error.bars = c("se", "sd", 
"conf.int", "none"), level = 0.95, xlab = deparse(substitute(factor1)), 
ylab = paste("mean of", deparse(substitute(response))), legend.lab = 
deparse(substitute(factor2)), 
main = "Plot of Means", pch = 1:n.levs.2, lty = 1:n.levs.2, 
col = palette(), ylim = NULL) 
{
if (!is.numeric(response)) 
stop(gettextRcmdr("Argument response must be numeric."))
xlab
ylab
legend.lab
error.bars <- match.arg(error.bars)
if (missing(factor2)) {
if (!is.factor(factor1)) 
stop(gettextRcmdr("Argument factor1 must be a factor."))
valid <- complete.cases(factor1, response)
factor1 <- factor1[valid]
response <- response[valid]
means <- tapply(response, factor1, mean)
sds <- tapply(response, factor1, sd)
ns <- tapply(response, factor1, length)
if (error.bars == "se") 
sds <- sds/sqrt(ns)
if (error.bars == "conf.int") 
sds <- qt((1 - level)/2, df = ns - 1, lower.tail = FALSE) * 
sds/sqrt(ns)
sds[is.na(sds)] <- 0
yrange <- if (error.bars != "none") 
c(min(means - sds, na.rm = TRUE), max(means + sds, 
na.rm = TRUE))
else range(means, na.rm = TRUE)
levs <- levels(factor1)
n.levs <- length(levs)
if ( is.null(ylim) ) {
  plot(c(1, n.levs), yrange, type = "n", xlab = xlab, ylab = ylab, 
   axes = FALSE, main = main)
} else {
  plot(c(1, n.levs), yrange, type = "n", xlab = xlab, ylab = ylab, 
   axes = FALSE, main = main, ylim = ylim)
}
points(1:n.levs, means, type = "b", pch = 16, cex = 2)
box()
axis(2)
axis(1, at = 1:n.levs, labels = levs)
if (error.bars != "none") 
arrows(1:n.levs, means - sds, 1:n.levs, means + sds, 
angle = 90, lty = 2, code = 3, length = 0.125)
}
else {
if (!(is.factor(factor1) | is.factor(factor2))) 
stop(gettextRcmdr("Arguments factor1 and factor2 must be factors."))
valid <- complete.cases(factor1, factor2, response)
factor1 <- factor1[valid]
factor2 <- factor2[valid]
response <- response[valid]
means <- tapply(response, list(factor1, factor2), mean)
sds <- tapply(response, list(factor1, factor2), sd)
ns <- tapply(response, list(factor1, factor2), length)
if (error.bars == "se") 
sds <- sds/sqrt(ns)
if (error.bars == "conf.int") 
sds <- qt((1 - level)/2, df = ns - 1, lower.tail = FALSE) * 
sds/sqrt(ns)
sds[is.na(sds)] <- 0
yrange <- if (error.bars != "none") 
c(min(means - sds,

Re: [R] cox regression extract strata as numeric

2009-10-27 Thread Steven McKinney

> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
> project.org] On Behalf Of ?e???da? ?pa?t??
> Sent: Tuesday, October 27, 2009 3:51 PM
> To: r-help@r-project.org
> Subject: [R] cox regression extract strata as numeric
> 
> 
> Hi there,
> 
> I perform a stratified cox and then I need the strata as a  numeric
> array "straft.ln"
> ft.ln <-
> coxph(Surv(times,deaths)~ages+chemos+chemos:f1+chemos:f2+horms+horms:f1
> +horms:f2+grades+grades:f1+grades:f2+positives+positives:f1+positives:f
> 2+sizes+sizes:f1+sizes:f2+strata(stra),data=ddd)
> 
> basehazzft.ln=basehaz(ft.ln,centered=FALSE)
> H0ft.ln=c(basehazzft.ln[,1])
> timeft.ln=c(basehazzft.ln[,2])
> straft.ln=c(basehazzft.ln[,3])
> 
> I notice that basehazzft.ln[,3] is not the same with straft.ln with
> respect to the values i.e.:
> 
> (basehazzft.ln$stra[1])
> [1] stra=1
> 134 Levels: stra=1 stra=10 stra=100 stra=101 stra=102 ... stra=99
> > c(basehazzft.ln$stra[1])
> [1] 1
> 
> 
> so far so good but:
> 
> (basehazzft.ln$stra[285])
> [1] stra=2
> 134 Levels: stra=1 stra=10 stra=100 stra=101 stra=102 ... stra=99
> > c(basehazzft.ln$stra[285])
> [1] 47
> 
> while the desired value is 2, I get a 47. What am I doing wrong? I
> tried the as.numeric function but I have the same problem..
> 
> I just want instead of the [ "stra=1" "stra=1" "stra=1" etc]  the
> [1,1,1 etc]
> 
> Thanx in advance for any answers!

Here's one way to do it (using the example on coxph help page):

> require("survival")
Loading required package: survival
Loading required package: splines
> bladder1 <- bladder[bladder$enum < 5, ] 
> m1 <- coxph(Surv(stop, event) ~ (rx + size + number) * strata(enum) + 
> cluster(id), bladder1)
> basehazzft.ln=basehaz(m1,centered=FALSE)
> (basehazzft.ln$stra[1])
[1] enum=1
Levels: enum=1 enum=2 enum=3 enum=4
> basehazzft.ln$stra
 [1] enum=1 enum=1 enum=1 enum=1 enum=1 enum=1 enum=1 enum=1 enum=1 enum=1 
enum=1 enum=1 enum=1
[14] enum=1 enum=1 enum=1 enum=1 enum=1 enum=1 enum=1 enum=1 enum=2 enum=2 
enum=2 enum=2 enum=2
[27] enum=2 enum=2 enum=2 enum=2 enum=2 enum=2 enum=2 enum=2 enum=2 enum=2 
enum=2 enum=2 enum=2
[40] enum=2 enum=3 enum=3 enum=3 enum=3 enum=3 enum=3 enum=3 enum=3 enum=3 
enum=3 enum=3 enum=3
[53] enum=3 enum=3 enum=3 enum=3 enum=3 enum=4 enum=4 enum=4 enum=4 enum=4 
enum=4 enum=4 enum=4
[66] enum=4 enum=4 enum=4
Levels: enum=1 enum=2 enum=3 enum=4
> as.character(basehazzft.ln$stra)
 [1] "enum=1" "enum=1" "enum=1" "enum=1" "enum=1" "enum=1" "enum=1" "enum=1" 
"enum=1" "enum=1"
[11] "enum=1" "enum=1" "enum=1" "enum=1" "enum=1" "enum=1" "enum=1" "enum=1" 
"enum=1" "enum=1"
[21] "enum=1" "enum=2" "enum=2" "enum=2" "enum=2" "enum=2" "enum=2" "enum=2" 
"enum=2" "enum=2"
[31] "enum=2" "enum=2" "enum=2" "enum=2" "enum=2" "enum=2" "enum=2" "enum=2" 
"enum=2" "enum=2"
[41] "enum=3" "enum=3" "enum=3" "enum=3" "enum=3" "enum=3" "enum=3" "enum=3" 
"enum=3" "enum=3"
[51] "enum=3" "enum=3" "enum=3" "enum=3" "enum=3" "enum=3" "enum=3" "enum=4" 
"enum=4" "enum=4"
[61] "enum=4" "enum=4" "enum=4" "enum=4" "enum=4" "enum=4" "enum=4" "enum=4"
> substr(as.character(basehazzft.ln$stra), 6, 999)
 [1] "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" 
"1" "1" "1" "2" "2" "2"
[25] "2" "2" "2" "2" "2" "2" "2" "2" "2" "2" "2" "2" "2" "2" "2" "2" "3" "3" 
"3" "3" "3" "3" "3" "3"
[49] "3" "3" "3" "3" "3" "3" "3" "3" "3" "4" "4" "4" "4" "4" "4" "4" "4" "4" 
"4" "4"
> as.numeric(substr(as.character(basehazzft.ln$stra), 6, 999))
 [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 
2 2 2 3 3 3 3 3 3 3 3
[49] 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4
>

HTH

Steven McKinney
Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre



> 
> 
> 
> 
> 
> ___
> ×ñçóéìïðïéåßôå Yahoo!;
> ÂáñåèÞêáôå ôá åíï÷ëçôéêÜ ìçíýìáôá (spam); Ôï Yahoo! Mail
> äéáèÝôåé ôçí êáëýôåñç äõíáôÞ ðñïóôáóßá êáôÜ ôùí åíï÷ëçôéêþí
> ìçíõìÜôùí http://login.yahoo.com/config/mail?.intl=gr
> 
>   [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to apply the Wilcoxon test to a hole table at once?

2009-10-23 Thread Steven McKinney

> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
> project.org] On Behalf Of Iurie Malai
> Sent: Friday, October 23, 2009 10:46 AM
> To: r-help@r-project.org
> Subject: [R] How to apply the Wilcoxon test to a hole table at once?
> 
> 
> Hi,
> 
> I have a data set:
> 
> > Dataset
>  X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17
> 1 user1  m 22 19 28 24 12 18  9   7   4   5   4   7   5   7   9
> 2 user2  f 25 19 23 18 18 15  6   8   6   6   7  10   7   7   7
> 3 user3  f 28 21 24 18 15 12 10   6   7   9   5  10   5   9   5
> 4 user4  f 26 19 26 21 12 18  6   6   5   1   3   8   6   5   6
> 5 user5  m 21 22 26 18  9  6  4   6   1   7   2   4   4   6   4
> 6 user6  m 24  8 25 12 18 12  7   8   4   1   4   6   7   5   6
> ...
> 71 user71  m 18  4 10  6  3  6  9   5  10   8   4   5   6   5   5
> 
> I can apply the Wilcoxon test on each column one by one, but how to do
> this
> on the hole table at once?
> 
> > wilcox.test(X3 ~ X2, alternative="two.sided", data=Dataset)
> 
>   Wilcoxon rank sum test with continuity correction
> 
> data:  X3 by X2
> W = 439, p-value = 0.1291
> alternative hypothesis: true location shift is not equal to 0
> 
> >

Here's one way to do it (using airquality dataset)

> lapply(airquality[1:4], function(x) wilcox.test(x ~ Month, 
> alternative="two.sided", data=airquality, subset = Month <= 6))
$Ozone

Wilcoxon rank sum test with continuity correction

data:  x by Month 
W = 82, p-value = 0.1925
alternative hypothesis: true location shift is not equal to 0 


$Solar.R

Wilcoxon rank sum test with continuity correction

data:  x by Month 
W = 391.5, p-value = 0.8354
alternative hypothesis: true location shift is not equal to 0 


$Wind

Wilcoxon rank sum test with continuity correction

data:  x by Month 
W = 566, p-value = 0.1461
alternative hypothesis: true location shift is not equal to 0 


$Temp

Wilcoxon rank sum test with continuity correction

data:  x by Month 
W = 78, p-value = 2.400e-08
alternative hypothesis: true location shift is not equal to 0 


Warning messages:
1: In wilcox.test.default(x = c(41L, 36L, 12L, 18L, 28L, 23L, 19L,  :
  cannot compute exact p-value with ties
2: In wilcox.test.default(x = c(190L, 118L, 149L, 313L, 299L, 99L,  :
  cannot compute exact p-value with ties
3: In wilcox.test.default(x = c(7.4, 8, 12.6, 11.5, 14.3, 14.9, 8.6,  :
  cannot compute exact p-value with ties
4: In wilcox.test.default(x = c(67L, 72L, 74L, 62L, 56L, 66L, 65L,  :
  cannot compute exact p-value with ties
 
# Sanity check:
> wilcox.test(Temp ~ Month, alternative="two.sided", data=airquality, subset = 
> Month <= 6)

Wilcoxon rank sum test with continuity correction

data:  Temp by Month 
W = 78, p-value = 2.400e-08
alternative hypothesis: true location shift is not equal to 0 

Warning message:
In wilcox.test.default(x = c(67L, 72L, 74L, 62L, 56L, 66L, 65L,  :
  cannot compute exact p-value with ties
>
# Same as lapply loop result


HTH

Steven McKinney
Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre





> 
> I researched on this, but I can't find a solution.
> I would really appreciate any help.
> 
> P.S. Excuse my lack of terminology :).
> 
> Iurie Malai
> Moldova Pedagogical State University
> --
> View this message in context: http://www.nabble.com/How-to-apply-the-
> Wilcoxon-test-to-a-hole-table-at-once--tp26030572p26030572.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Navigate to Index page of a package from R command prompt

2009-07-24 Thread Steven McKinney
Okay, I've worked out much of the details on my PC and Mac.

What I'd like to be able to do is enter the command

> help(package = "survival")

or

> ?survival::

and get the usual hyperlinked help page displayed instead
of the static "text only" display or an error message.

The help() function returns an object of 
class "help_files_with_topic".
The object consists of a character vector
with several attributes.

PC:  Windows XP

> library("survival")
> foo <- help("aareg", package = "survival")
> class(foo)
[1] "help_files_with_topic"
> foo[1]
[1] "C:/PROGRA~1/R/R-29~1.1/library/survival/chm/aareg"
> attributes(foo)
$call
help(topic = "aareg", package = "survival")

$pager
[1] "internal"

$topic
[1] "aareg"

$tried_all_packages
[1] FALSE

$type
[1] "chm"

$class
[1] "help_files_with_topic"

> bar <- help("", package = "survival")
> class(bar)
[1] "help_files_with_topic"
> bar[1]
[1] NA
> attributes(bar)
$call
help(topic = "", package = "survival")

$pager
[1] "internal"

$topic
[1] ""

$tried_all_packages
[1] FALSE

$type
[1] "chm"

$class
[1] "help_files_with_topic"

If I alter the character vector to
point to "00Index"

> bar[1] <- "C:/PROGRA~1/R/R-29~1.1/library/survival/chm/00Index"
> bar

I see exactly what I've been attempting to achieve.


Mac OS X:

> foo <- help("aareg", package = "survival")
> foo[1]
[1] "/Library/Frameworks/R.framework/Resources/library/survival/html/aareg.html"
> attributes(foo)
$call
help(topic = "aareg", package = "survival")

$pager
[1] "/Library/Frameworks/R.framework/Resources/bin/pager"

$topic
[1] "aareg"

$tried_all_packages
[1] FALSE

$type
[1] "html"

$class
[1] "help_files_with_topic"

> bar <- help("", package = "survival")
> bar[1]
[1] NA
> bar[1] <- 
> "/Library/Frameworks/R.framework/Resources/library/survival/html/00Index.html"
> bar

Again I see exactly what I've been after.

Running R in Emacs on Mac OS X:

> foo <- help(topic = "aareg", package = "survival")
> foo[1]
[1] "/Library/Frameworks/R.framework/Resources/library/survival/html/aareg.html"
> attributes(foo)
$call
help(topic = "aareg", package = "survival")

$pager
[1] "/Library/Frameworks/R.framework/Resources/bin/pager"

$topic
[1] "aareg"

$tried_all_packages
[1] FALSE

$type
[1] "html"

$class
[1] "help_files_with_topic"

> bar <- help(topic = "", package = "survival")
> bar[1]
[1] NA
> bar[1] <- 
> "/Library/Frameworks/R.framework/Resources/library/survival/html/00Index.html"
> bar
Help for '' is shown in browser /usr/bin/open ...
Use
help("", htmlhelp = FALSE)
or
options(htmlhelp = FALSE)
to revert.
>

Again, what I've been trying to achieve.

When a user loads a new library and doesn't yet know any function names,
I think
> help(package = "newLibrary")
or
> ?newLibrary::

should perform the above action whenever possible instead of 
producing static help or an error message.

The "00Index" 'object' should be available for such use
whenever it exists.  (I've not yet investigated all the
details to answer when it is and is not created in the
Help building exercise of package building.)

I can work out the coding details to make this happen,
but am I missing some key point?  Any reasons why this
would be a Bad Idea?



Best

Steven McKinney, Ph.D.

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre

email: smckinney -at-  bccrc +dot+ ca
tel: 604-675-8000 x7561

BCCRC
Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C.
V5Z 1L3

Canada







> -Original Message-
> From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com]
> Sent: Thursday, July 23, 2009 4:48 PM
> To: Steven McKinney
> Cc: R-help@r-project.org
> Subject: Re: [R] Navigate to Index page of a package from R command
> prompt
> 
> Try
> 
> enter at the R console: help.start()
> and then when the help comes up in your browser click on Packages
> and then click on the package you want
> and then click on the help file you want
> 
> On Thu, Jul 23, 2009 at 3:30 PM, Steven McKinney
> wrote:
> >
> > Hi all,
> >
> > Is there a way to navigate directly to the "Index" page of help
> > for a package?
> >
> > Here's my connundrum:
> >
> > I download and install

Re: [R] Navigate to Index page of a package from R command prompt

2009-07-23 Thread Steven McKinney

> -Original Message-
> From: Marc Schwartz [mailto:marc_schwa...@me.com]
> Sent: Thursday, July 23, 2009 2:04 PM
> To: Steven McKinney
> Cc: R-help@r-project.org
> Subject: Re: [R] Navigate to Index page of a package from R command
> prompt
> 
> On Jul 23, 2009, at 3:30 PM, Steven McKinney wrote:
> 
> >
> > Hi all,
> >
> > Is there a way to navigate directly to the "Index" page of help
> > for a package?
> >
> > Here's my connundrum:
> >
> > I download and install package "foo".
> > I don't know what functions are in package "foo",
> > so I can't invoke the help for package "foo" via
> >> ?someFunction
> >
> > help(package = "foo")
> > pops up some non-hyperlinked information page, not
> > package foo's help Index page.
> >
> > If the package author kindly made a "foo" object or function
> > and put that in package "foo", then
> >> ?foo
> > works and yields a help page for package "foo".
> > Now at the bottom of the help page is a hyperlink "Index"
> > and I can click that to navigate to the main help Index page
> > (the page I really want to get to straight from the R
> > command line).
> >
> > I see that the link to "Index" for package "foo" appears always to be
> > (on my Mac)
> >
> file:///Library/Frameworks/R.framework/Resources/library/foo/html/00Ind
> ex.html
> >
> > e.g.
> >
> file:///Library/Frameworks/R.framework/Resources/library/cmprsk/html/00
> Index.html
> >
> >
> file:///Library/Frameworks/R.framework/Resources/library/utils/html/00I
> ndex.html
> >
> > Is there a command from the R listener that can take me directly to
> > this "00Index.html" page of help for package "foo"?
> >
> > something like
> >> help("00Index", package = "utils")
> >
> > (but this does not work)?
> >
> > Any info appreciated
> >
> > Best
> >
> > Steven McKinney
> 
> 
> Steven,
> 
> Once a package has been loaded with 'library(PackageName)', you can
> use .path.package("PackageName") to get the path to the main package
> installation folder and then append the remainder:
> 
> library(survival)
> 
>  > .path.package("survival")
> [1] "/Library/Frameworks/R.framework/Resources/library/survival"
> 
>  > file.path(.path.package("survival"), "html", "00Index.html")
> [1] "/Library/Frameworks/R.framework/Resources/library/survival/html/
> 00Index.html"
> 
> See ?.path.package and ?file.path for more information.
> 
> You can also use .find.package() for unloaded packages and that is
> described on the same help page as .path.package.
> 
> Finally, you can use browseURL() to bring the index file up in your
> browser:
> 
>browseURL(file.path(.path.package("survival"), "html",
> "00Index.html"))
> 
> HTH,
> 
> Marc Schwartz

Thanks Marc,

Certainly useful information.

I'm just baffled and confused as to why
> ?survival
or
> ?survival:Index
or some such incantation
doesn't just take me right to the index page 
for package "survival" in the default help browser 
for the R session.

Why does it take so much work to get to the 
index page?

> library("survival")
> browseURL(file.path(.path.package("survival"), "html", "00Index.html"))
does conceptually do what I'm looking for, but 
not in the default browser.

I'd like to file a feature request on R-devel if this
is not too naïve of a request.

?survival:Index 
or some such syntax should do what you illustrate
above, but in the default help mechanism.
I don't yet know enough about the ? and ?? machinery
to develop such a capability.  Any ideas?




Best

Steve McKinney

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] mathematical notation in R

2009-07-23 Thread Steven McKinney
Hi Mary,

Something such as

> plot(1,1)
> title(expression(paste("LBAuo = -", infinity)))

Is this what you're after?

Best

Steve McKinney


From: Mary A. Marion [mms...@comcast.net]
Sent: July 23, 2009 4:23 PM
To: Steven McKinney
Subject: Re: [R] mathematical notation in R

Hi,

LBAuo <- -Inf was helpful but what I really would like to do is use the "Greek" 
expression for
infinity that is the number 8 on its side.

syntax   meaning


infinityinfinity symbol
was found using ?plotmath but I don't know how to use this information.  Any 
ideas?

Thank you.

Sincerely,
Mary A. Marion


Steven McKinney wrote:

Does this approach what you're looking for?
See
?is.finite

for more info



LBAuo<- -
LBAuo


[1] -


if (LBAuo <= -9999) LBAuo <- -Inf
LBAuo


[1] -Inf



HTH
Steven McKinney



-Original Message-
From: r-help-boun...@r-project.org<mailto:r-help-boun...@r-project.org> 
[mailto:r-help-boun...@r-
project.org] On Behalf Of Mary A. Marion
Sent: Wednesday, July 22, 2009 8:00 PM
To: r-help@r-project.org<mailto:r-help@r-project.org>
Subject: [R] mathematical notation in R

Hello,

I have an R function which includes the statement LBAuo<- -.
Rather
than printing out - I want it to print out - infinity.
As of yet I have not been able to do that.  I am not in a graphics
window just outputting a set of variables.  Can you help?
Thanks.

Sincerely,
Mary A. Marion

__
R-help@r-project.org<mailto:R-help@r-project.org> mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-
guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Navigate to Index page of a package from R command prompt

2009-07-23 Thread Steven McKinney

Hi all,

Is there a way to navigate directly to the "Index" page of help
for a package?

Here's my connundrum:

I download and install package "foo".
I don't know what functions are in package "foo",
so I can't invoke the help for package "foo" via
> ?someFunction

help(package = "foo")
pops up some non-hyperlinked information page, not
package foo's help Index page.

If the package author kindly made a "foo" object or function
and put that in package "foo", then 
> ?foo
works and yields a help page for package "foo".
Now at the bottom of the help page is a hyperlink "Index"
and I can click that to navigate to the main help Index page
(the page I really want to get to straight from the R
command line).

I see that the link to "Index" for package "foo" appears always to be
(on my Mac)
file:///Library/Frameworks/R.framework/Resources/library/foo/html/00Index.html

e.g.
file:///Library/Frameworks/R.framework/Resources/library/cmprsk/html/00Index.html

file:///Library/Frameworks/R.framework/Resources/library/utils/html/00Index.html

Is there a command from the R listener that can take me directly to
this "00Index.html" page of help for package "foo"?

something like
> help("00Index", package = "utils")

(but this does not work)?

Any info appreciated

Best

Steven McKinney

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] mathematical notation in R

2009-07-22 Thread Steven McKinney
Does this approach what you're looking for?
See 
?is.finite

for more info

> LBAuo<- -
> LBAuo
[1] -
> if (LBAuo <= -) LBAuo <- -Inf
> LBAuo
[1] -Inf
>

HTH
Steven McKinney

> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
> project.org] On Behalf Of Mary A. Marion
> Sent: Wednesday, July 22, 2009 8:00 PM
> To: r-help@r-project.org
> Subject: [R] mathematical notation in R
> 
> Hello,
> 
> I have an R function which includes the statement LBAuo<- -.
> Rather
> than printing out - I want it to print out - infinity.
> As of yet I have not been able to do that.  I am not in a graphics
> window just outputting a set of variables.  Can you help?
> Thanks.
> 
> Sincerely,
> Mary A. Marion
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Referencing data frames

2009-06-12 Thread Steven McKinney
Hi Payam,

Here's a basic example using pointless
data frames and an otherwise useless
function to illustrate the issues you
want 

> ## Fresh R session with nothing yet defined
> foo <- data.frame(matrix(1:12, ncol = 3))
> bar <- data.frame(matrix(101:112, ncol = 3))
> 
> objects()
[1] "bar" "foo"
> 
> myrbind <- function(names) {
+   df1 <- get(names[1])
+   df2 <- get(names[2])
+   dfrbind <- rbind(df1, df2)
+   dfrbind.name <- paste(names, collapse = ".")
+   assign(dfrbind.name, dfrbind, pos = 1)
+   invisible()
+ }
> 
> dfnames <- c("foo", "bar", "baz")
> myrbind(dfnames)
> objects()
[1] "bar" "dfnames" "foo" "foo.bar.baz" "myrbind"
> foo.bar.baz
   X1  X2  X3
1   1   5   9
2   2   6  10
3   3   7  11
4   4   8  12
5 101 105 109
6 102 106 110
7 103 107 111
8 104 108 112
> 


HTH


Steven McKinney, Ph.D.

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre

email: smckinney +at+ bccrc +dot+ ca

tel: 604-675-8000 x7561

BCCRC
Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C. 
V5Z 1L3
Canada




-Original Message-
From: r-help-boun...@r-project.org on behalf of Payam Minoofar
Sent: Fri 6/12/2009 3:00 PM
To: r-help@r-project.org
Subject: [R] Referencing data frames
 
Hi,

How do I use the string content of a string variable to reference a data frame 
of the same name? I want to do the typical tasks of 1) building a name with a 
string variable and using the string variable to create a data frame (or any 
object) whose name is the string value of the variable and 2) pass on a string 
to a function as a parameter, and then use that string to refer to an existing 
data frame.

Thanks in advance.

Payam
--
Payam Minoofar, Ph.D.
Scientist
Meissner Filtration Products
4181 Calle Tesoro
Camarillo, CA 93012
USA
+1 805 388 9911
+1 805 388 5948 fax
payam.minoo...@meissner.com


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Need help understanding output from aov and from anova

2009-06-03 Thread Steven McKinney
Hi Suman,

What version of R are you running?

In R 2.9.0 running your first example yields a warning

 Warning message:
 In anova.lm(lm(vtot ~ fac)) :
   ANOVA F-tests on an essentially perfect fit are unreliable

so some adept R developer has taken the time to figure
out how to warn you about such a problem.

Perhaps someone will add this to aov() at some point as well.

The only variability in this problem is that introduced
by machine precision rounding errors.

The exercise of submitting data with no variability to 
a program designed to assess variability cannot be expected 
to produce meaningful output, so there's nothing to
understand except the issue of machine precision.
Machine roundoff error is an important topic, so I'd
recommend learning about that issue, which will do most
to help understand these examples.

Best

SteveM


R version 2.9.0 (2009-04-17)
Copyright (C) 2009 The R Foundation for Statistical Computing
ISBN 3-900051-07-0

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> vtot=c(7.29917, 7.29917, 7.29917)  #identical values
> fac=as.factor(c(1,1,2))   #group 1 has first two elements, group 2 has
> anova(lm(vtot~fac))
Analysis of Variance Table

Response: vtot
  Df Sum SqMean Sq F value Pr(>F)
fac1 1.6818e-30 1.6818e-30  0. 0.6667
Residuals  1 5.0455e-30 5.0455e-30   
Warning message:
In anova.lm(lm(vtot ~ fac)) :
  ANOVA F-tests on an essentially perfect fit are unreliable
> 
> summary(aov(vtot~fac))
Df Sum SqMean Sq F value Pr(>F)
fac  1 1.6818e-30 1.6818e-30  0. 0.6667
Residuals1 5.0455e-30 5.0455e-30   
> 
> fac=as.factor(c(1,2,2))
> anova(lm(vtot~fac))
Analysis of Variance Table

Response: vtot
  Df Sum SqMean SqF valuePr(>F)
fac1 6.7274e-30 6.7274e-30 1.3340e+32 < 2.2e-16 ***
Residuals  1  5.043e-62  5.043e-62 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
Warning message:
In anova.lm(lm(vtot ~ fac)) :
  ANOVA F-tests on an essentially perfect fit are unreliable
>

 

 

Steven McKinney, Ph.D.

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre

email: smckin...@bccrc.ca
tel: 604-675-8000 x7561

BCCRC
Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C. 
V5Z 1L3

Canada


 

 


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
> project.org] On Behalf Of Suman Sundaresh
> Sent: Wednesday, June 03, 2009 3:55 PM
> To: r-help@r-project.org
> Subject: [R] Need help understanding output from aov and from anova
> 
> Hi all,
> 
> I noticed something strange when I ran aov and anova.
> 
> vtot=c(7.29917, 7.29917, 7.29917)  #identical values
> fac=as.factor(c(1,1,2))   #group 1 has first two elements, group 2 has
> the 3rd element
> 
> When I run:
> > anova(lm(vtot~fac))
> Analysis of Variance Table
> 
> Response: vtot
>   Df Sum SqMean Sq F value Pr(>F)
> fac1 1.6818e-30 1.6818e-30  0. 0.6667
> Residuals  1 5.0455e-30 5.0455e-30
> 
> 
> I get a p-value of 0.667. This seems strange to me. I would have
> expected the p-value to be NaN.
> 
> Again, when I run:
> > summary(aov(vtot~fac))
> Df Sum SqMean Sq F value Pr(>F)
> fac  1 1.6818e-30 1.6818e-30  0. 0.6667
> Residuals1 5.0455e-30 5.0455e-30
> 
> Again same p-value.
> 
> 
> Now, if I set fac to c(1,2,2) which is essentially just switching the
> groups.
> fac=as.factor(c(1,2,2))
> 
> And run,
> > anova(lm(vtot~fac))
> Analysis of Variance Table
> 
> Response: vtot
>   Df Sum SqMean SqF valuePr(>F)
> fac1 6.7274e-30 6.7274e-30 1.3340e+32 < 2.2e-16 ***
> Residuals  1  5.043e-62  5.043e-62
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> 
> 
> The p-value is really significant which again looks very strange.
> 
> Please could someone shed some light on what I may be missing here?
> 
> Thanks very much.
> Suman.
> 
> 

Re: [R] Warning trying to plot -log(log(survival))

2009-05-11 Thread Steven McKinney
Hi John,

Thanks for the data.  I put your data in a
plain text file and read it in without any problems.

> jsdf <- read.csv("js.csv")
> jsdf
   Time Time30 Died Age Rx
1 3  0.1000  40  2
2 8  0.2671  21  2
310  0.3331  18  2
412  0.4000  42  2
516  0.5331  23  2
...

I can fit the model you specified and obtain the plot without problem.
So this looks like an issue with R 2.8.1 and whatever survival
package version you have.  I have R 2.9.0 and survival_2.35-4.
Can you update your versions?

HTH

> fit0 <- coxph(Surv(Time30, Died) ~ strata(Rx) + Age, data = jsdf)
> fit0
Call:
coxph(formula = Surv(Time30, Died) ~ strata(Rx) + Age, data = jsdf)


  coef exp(coef) se(coef)z p
Age 0.0611  1.06   0.0257 2.38 0.017

Likelihood ratio test=5.84  on 1 df, p=0.0156  n= 64 
> plot(survfit(fit0),fun="cloglog")
### No warning, two curves plotted.

> sessionInfo()
R version 2.9.0 (2009-04-17) 
i386-pc-mingw32 

locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
States.1252;LC_MONETARY=English_United
States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] splines   stats graphics  grDevices utils datasets  methods

[8] base 

other attached packages:
[1] survival_2.35-4
>

 

 

Steven McKinney 

 


> -Original Message-
> From: John Sorkin [mailto:jsor...@grecc.umaryland.edu]
> Sent: Monday, May 11, 2009 8:53 PM
> To: Steven McKinney; r-h...@stat.math.ethz.ch
> Subject: RE: [R] Warning trying to plot -log(log(survival))
> 
> Time30 is simply Time/30 and reflects months of follow-up.
> A copy of my data follows:
> Time,Time30,Died,Age,Rx
> 3,0.1,0,40,2
> 8,0.26667,1,21,2
> 10,0.3,1,18,2
> 12,0.4,0,42,2
> 16,0.5,1,23,2
> 17,0.56667,1,21,2
> 22,0.7,1,13,2
> 64,2.1,0,20,2
> 65,2.16667,0,15,2
> 77,2.56667,0,34,2
> 82,2.7,0,14,2
> 98,3.26667,0,10,2
> 155,5.16667,0,27,2
> 189,6.3,0,9,2
> 199,6.6,0,19,2
> 247,8.2,0,14,2
> 324,10.8,0,23,2
> 356,11.8667,0,13,2
> 378,12.6,0,34,2
> 408,13.6,0,27,2
> 411,13.7,0,5,2
> 420,14,0,23,2
> 449,14.9667,0,37,2
> 490,16.,0,37,2
> 528,17.6,0,32,2
> 547,18.2333,0,32,2
> 691,23.0333,0,38,2
> 769,25.6333,0,18,2
> ,37.0333,0,20,2
> 1173,39.1,0,12,2
> 1213,40.4333,0,12,2
> 1357,45.2333,0,29,2
> 9,0.3,1,35,1
> 11,0.36667,1,27,1
> 12,0.4,1,22,1
> 20,0.7,1,21,1
> 20,0.7,1,30,1
> 22,0.7,1,7,1
> 25,0.8,1,36,1
> 25,0.8,1,38,1
> 25,0.8,0,20,1
> 28,0.9,1,25,1
> 28,0.9,1,28,1
> 31,1.0,1,17,1
> 35,1.16667,1,21,1
> 35,1.16667,1,25,1
> 46,1.5,1,35,1
> 49,1.6,1,19,1
> 104,3.46667,0,27,1
> 106,3.5,0,19,1
> 156,5.2,0,15,1
> 218,7.26667,0,26,1
> 230,7.7,0,11,1
> 231,7.7,0,14,1
> 316,10.5333,0,15,1
> 393,13.1,0,27,1
> 395,13.1667,0,2,1
> 428,14.2667,0,3,1
> 469,15.6333,0,14,1
> 602,20.0667,0,18,1
> 681,22.7,0,23,1
> 690,23,0,9,1
> 1112,37.0667,0,11,1
> 1180,39.,0,11,1
> 
> 
> 
> 
> John David Sorkin M.D., Ph.D.
> Chief, Biostatistics and Informatics
> University of Maryland School of Medicine Division of Gerontology
> Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR)
> Baltimore, MD 21201-1524
> (Phone) 410-605-7119
> (Fax) 410-605-7913 (Please call phone number above prior to faxing)
> 
> >>> Steven McKinney  5/11/2009 10:04 PM >>>
> Hi John,
> 
> I can't reproduce your case with built-in data sets.
> 
> Your data set is small.  Can you show the data for the variables
> involved in your example?
> (Time30, Died, Rx, Age)
> 
> 
> 
> 
> 
> Steven McKinney, Ph.D.
> 
> Statistician
> Molecular Oncology and Breast Cancer Program British Columbia Cancer
> Research Centre
> 
> email: smckin...@bccrc.ca
> tel: 604-675-8000 x7561
> 
> BCCRC
> Molecular Oncology
> 675 West 10th Ave, Floor 4
> Vancouver B.C.
> V5Z 1L3
> 
> Canada
> 
> 
> 
> 
> 
> 
> 
> > -Original Message-
> > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
> > project.org] On Behalf Of John Sorkin
> > Sent: Monday, May 11, 2009 12:19 PM
> > To: r-h...@stat.math.ethz.ch
> > Subject: [R] Warning trying to plot -log(log(survival))
> >
> > windows xp
> > R 2.8.1
> >
> > I am trying to plot the -log(log(survival)) to visually test the
> > proportional hazards assumption of a Cox regression. The plot, wh

Re: [R] Warning trying to plot -log(log(survival))

2009-05-11 Thread Steven McKinney
Hi John,

I can't reproduce your case with built-in data sets.

Your data set is small.  Can you show the data for the
variables involved in your example?
(Time30, Died, Rx, Age)

 

 

Steven McKinney, Ph.D.

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre

email: smckin...@bccrc.ca
tel: 604-675-8000 x7561

BCCRC
Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C. 
V5Z 1L3

Canada


 

 


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
> project.org] On Behalf Of John Sorkin
> Sent: Monday, May 11, 2009 12:19 PM
> To: r-h...@stat.math.ethz.ch
> Subject: [R] Warning trying to plot -log(log(survival))
> 
> windows xp
> R 2.8.1
> 
> I am trying to plot the -log(log(survival)) to visually test the
> proportional hazards assumption of a Cox regression. The plot, which
> should give two lines (one for each treatment) gives only one line and
> a warning message. I would appreciate help getting two lines, and an
> explanation of the warning message. My problem may the that I have
very
> few events in one of my strata, but I don't know.
> Thanks,
> John
> 
> fit0<-coxph(Surv(Time30,Died)~strata(Rx)+Age,data=GVHDdata)
> plot(survfit(fit0),fun="cloglog")
> 
> 
> WARNING: Warning in xy.coords (x, y, xlabel, ylabel, log)  :
>  2 x values <=0 omitted from logarithmic plot
> 
> 
> 
> > print(survfit(fit0),fun="cloglog")
> Call: survfit.coxph(object = fit0)
> 
>   n events median 0.95LCL 0.95UCL
> Rx=0 31  5Inf Inf Inf
> Rx=1 32 15Inf1.17 Inf
> 
> John David Sorkin M.D., Ph.D.
> Chief, Biostatistics and Informatics
> University of Maryland School of Medicine Division of Gerontology
> Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR)
> Baltimore, MD 21201-1524
> (Phone) 410-605-7119
> (Fax) 410-605-7913 (Please call phone number above prior to faxing)
> 
> Confidentiality Statement:
> This email message, including any attachments, is for
> th...{{dropped:6}}
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Corrupt data frame construction - bug?

2009-04-29 Thread Steven McKinney
Thanks Duncan,

Comments and a proposed bug fix in-line below:
 

> -Original Message-
> From: Duncan Murdoch [mailto:murd...@stats.uwo.ca]
> Sent: Wednesday, April 29, 2009 5:10 PM
> To: Steven McKinney
> Cc: R-help@r-project.org
> Subject: Re: [R] Corrupt data frame construction - bug?
> 
> On 29/04/2009 6:41 PM, Steven McKinney wrote:
> > Hi useRs,
> >
> > A recent coding infelicity along these lines yielded a corrupt data
> > frame.
> >
> > foo <- matrix(1:12, nrow = 3)
> > bar <- data.frame(foo)
> > bar$NewCol <- foo[foo[, 1] == 4, 4]
> > bar
> > lapply(bar, length)
> >
> >
> >
> >
> >> foo <- matrix(1:12, nrow = 3)
> >> bar <- data.frame(foo)
> >> bar$NewCol <- foo[foo[, 1] == 4, 4]
> >> bar
> >   X1 X2 X3 X4 NewCol
> > 1  1  4  7 10   
> > 2  2  5  8 11   
> > 3  3  6  9 12   
> > Warning message:
> > In format.data.frame(x, digits = digits, na.encode = FALSE) :
> >   corrupt data frame: columns will be truncated or padded with NAs
> >> lapply(bar, length)
> > $X1
> > [1] 3
> >
> > $X2
> > [1] 3
> >
> > $X3
> > [1] 3
> >
> > $X4
> > [1] 3
> >
> > $NewCol
> > [1] 0
> >
> >
> > Is this a bug in the data.frame machinery?
> > If an attempt is made to add a new column to a data frame, and the
> new
> > object does not have length = number of rows of data frame, or
cannot
> > be made to have such length via recycling, shouldn't an error be
> > thrown?
> >
> > Instead in this example I end up with a "corrupt data frame" having
> > one zero-length column.
> >
> >
> > Should this be reported as a bug, or did I misinterpret the
> > documentation?
> 
> I don't think "$" uses any data.frame machinery.  You are working at a
> lower level.
> 
> If you had added the new column using
> 
> bar <- data.frame(bar, NewCol=foo[foo[, 1] == 4, 4])
> 
> you would have seen the error:
> 
> Error in data.frame(bar, NewCol = foo[foo[, 1] == 4, 4]) :
>arguments imply differing number of rows: 3, 0
> 
> But since you treated it as a list, it let you go ahead and create
> something that was labelled as a data.frame but wasn't.  This is one
of
> the reasons some people prefer S4 methods:  it's easier to protect
> against people who mislabel things.
> 

I did some more digging on '$' - there is a data.frame method for it:

> getAnywhere("$<-.data.frame" )
A single object matching '$<-.data.frame' was found
It was found in the following places
  package:base
  registered S3 method for $<- from namespace base
  namespace:base
with value

function (x, i, value) 
{
cl <- oldClass(x)
class(x) <- NULL
nrows <- .row_names_info(x, 2L)
if (!is.null(value)) {
N <- NROW(value)
if (N > nrows) 
stop(gettextf("replacement has %d rows, data has %d", 
N, nrows), domain = NA)
if (N < nrows && N > 0L) 
if (nrows%%N == 0L && length(dim(value)) <= 1L) 
value <- rep(value, length.out = nrows)
else stop(gettextf("replacement has %d rows, data has %d", 
N, nrows), domain = NA)
if (is.atomic(value)) 
names(value) <- NULL
}
x[[i]] <- value
class(x) <- cl
return(x)
}
>


I placed a browser() command before return(x) and did some poking
around.

It seems to me there's a bug in this function.  It should be able to
detect the problem I threw at it, and throw an error as you point out is
thrown by the other data.frame assign method.


I modified the rows
  if (N < nrows && N > 0L) 
if (nrows%%N == 0L && length(dim(value)) <= 1L)
to read
   if (N < nrows) 
if (N > 0L && nrows%%N == 0L && length(dim(value)) <= 1L)

as in

"$<-.data.frame" <-
function (x, i, value) 
{
cl <- oldClass(x)
class(x) <- NULL
nrows <- .row_names_info(x, 2L)
if (!is.null(value)) {
N <- NROW(value)
if (N > nrows) 
stop(gettextf("replacement has %d rows, data has %d", 
N, nrows), domain = NA)
if (N < nrows) 
if (N > 0L && nrows%%N == 0L && length(dim(value)) <= 1L) 
value <- rep(value, length.out = nrows)
else stop(gettextf("replacement has %d rows, data has %d", 
N, nrows), domain = NA)
if (is.atomic(value)) 
names(value

[R] Corrupt data frame construction - bug?

2009-04-29 Thread Steven McKinney
Hi useRs,

A recent coding infelicity along these lines
yielded a corrupt data frame.

foo <- matrix(1:12, nrow = 3)
bar <- data.frame(foo)
bar$NewCol <- foo[foo[, 1] == 4, 4]
bar
lapply(bar, length)




> foo <- matrix(1:12, nrow = 3)
> bar <- data.frame(foo)
> bar$NewCol <- foo[foo[, 1] == 4, 4]
> bar
  X1 X2 X3 X4 NewCol
1  1  4  7 10   
2  2  5  8 11   
3  3  6  9 12   
Warning message:
In format.data.frame(x, digits = digits, na.encode = FALSE) :
  corrupt data frame: columns will be truncated or padded with NAs
> lapply(bar, length)
$X1
[1] 3

$X2
[1] 3

$X3
[1] 3

$X4
[1] 3

$NewCol
[1] 0


Is this a bug in the data.frame machinery?
If an attempt is made to add a new column
to a data frame, and the new object does
not have length = number of rows of data frame,
or cannot be made to have such length via recycling,
shouldn't an error be thrown?

Instead in this example I end up with a
"corrupt data frame" having one zero-length column.


Should this be reported as a bug, or did I misinterpret
the documentation?




> sessionInfo()
R version 2.9.0 (2009-04-17) 
powerpc-apple-darwin8.11.1 

locale:
en_CA.UTF-8/en_CA.UTF-8/C/C/en_CA.UTF-8/en_CA.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base 

other attached packages:
[1] nlme_3.1-90

loaded via a namespace (and not attached):
[1] grid_2.9.0  lattice_0.17-22 tools_2.9.0    
> 


Also occurs on Windows box with R 2.8.1



Steven McKinney

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre

email: smckinney +at+ bccrc +dot+ ca

tel: 604-675-8000 x7561

BCCRC
Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C. 
V5Z 1L3
Canada

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] create objects in a loop and adding sqlQuery content to them

2009-04-20 Thread Steven McKinney
> 
> >
> > #Here my first try [error message on the line within the loop,
saying
> > something like:
> > # 'recursive indexing on level 2 failed']
> >   sites_object_list <- vector("list",99)
> 
> If there's a site number larger than 99 this will be problematic.

Sorry, this isn't correct - R will just increase the list size if a 
site number is bigger than 99.


Steve McKinney

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] create objects in a loop and adding sqlQuery content to them

2009-04-20 Thread Steven McKinney


 Hello


> -Original Message-
> From: r-help-boun...@r-project.org
[mailto:r-help-boun...@r-project.org]
> On Behalf Of Katharina May
> Sent: Monday, April 20, 2009 5:05 PM
> To: r-help@r-project.org
> Subject: [R] create objects in a loop and adding sqlQuery content to
them
> 
> Hi there,
> 
> I've got a database or rather spreadsheet with several columns and
rows.
> For one column named sites I want to loop through all possible values
> and retrieve
> all data out of the database where site = x and write it into an
> object named 'sitex_data'.
> 
> Somehow I'm really missing something as I'm not able to create these
> sitex_data objects with
> the database values, neither using list nor assign...
> 
> Here some code snipplets:
>  >library (RODBC)
>  >channel <- odbcConnectExcel2007 ("biomass_data.xlsx")
>  >site_data  <-  sqlQuery(channel, "select site_no from  [biomass_data
> $] group by site_no")
> 
> #Here the values I want to loop through
>  >str(site_data)
> 'data.frame':   44 obs. of  1 variable:
>   $ site_no: num  4 7 9 10 15 16 17 18 19 20 ...

There's only 44 of them - can you show the whole thing? e.g.
>site_data

Also check the class of site_data$site_no
It might be of class character or factor and
not just numeric as it might initially appear.


> 
> #Here my first try [error message on the line within the loop, saying
> something like:
> # 'recursive indexing on level 2 failed']
>   sites_object_list <- vector("list",99)

If there's a site number larger than 99 this will be problematic.

>   for(i in site_data) {
> sites_object_list[[i]] <- sqlQuery(channel, paste("select * from
> [biomass_data$] where site_no = ",i,sep=""))
>   }
> 
> #Here my second try [error message  on the line within the loop,
> saying something like:
> # 'only the first element will be used as variable name']
> for(i in site_data) {
>   assign(paste("site",i,"_data",sep=""), sqlQuery(channel,
> paste("select * from  [biomass_data$] where site_no = ",i,sep="")))
> 
> }

I'm guessing one of the site numbers is not just a simple integer
but has two numbers with a space or something in it.

> 
> I would be very, very glad if anybody sends me a clue about this, as
> I'm a obviuos Newbie using
> R...
> 
> Thanks,
> 
>   Katharina
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.




 

 

Steven McKinney, Ph.D.

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre

email: smckin...@bccrc.ca
tel: 604-675-8000 x7561

BCCRC
Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C. 
V5Z 1L3

Canada

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] what is R equivalent of Fortran DOUBLE PRECISION ?

2009-03-30 Thread Steven McKinney


>From the help page for "Foreign"
> ?Foreign


 Numeric vectors in R will be passed as type double * to C (and as double 
precision to Fortran)


Also, see the help page for "double"

> ?double

 R has no single precision data type. All real numbers are stored in double 
precision format. 
 The functions as.single and single are identical to as.double and double 
except they set 
 the attribute Csingle that is used in the .C and .Fortran interface, and they 
are intended 
 only to be used in that context.

HTH

Steven McKinney, Ph.D.

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre

email: smckinney +at+ bccrc +dot+ ca

tel: 604-675-8000 x7561

BCCRC
Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C. 
V5Z 1L3
Canada




-Original Message-
From: r-help-boun...@r-project.org on behalf of mau...@alice.it
Sent: Mon 3/30/2009 3:07 AM
To: r-help@r-project.org
Cc: John C.  Nash
Subject: [R] what is R equivalent of Fortran DOUBLE PRECISION ?
 
I noticed taht R cannot understand certain Fortran real constant formats. For 
instance:

 c14<- as.double( 7.785205408500864D-02)
Error: unexpected symbol in " c14<- as.double( 7.785205408500864D"

The above "D" is used in Fortran language to indicate the memory starage mode. 
That is for instructing Fortran compiler 
to store such a REAL constant in  DOUBLE PRECISION... am I right ?
Since R cannot undestand numerical conatant post-fixed by the letter "D", I 
wonder how I can instruct R interpreter to 
store such a numerical constant reserving as muh memory as necessary so as to 
accommodate a double precision number.

I noticed R accepts the folllowing syntax but I do not know if i have achieved 
my goal thsi way:

>  c14<- as.double( 7.785205408500864E-02)
> typeof(c4)
[1] "double"

My questions are: what is the best precision I can get with R when dealing with 
real number ?
Is R "double" type equvalent to Fortran DOUBLE PRECISION  for internal number 
representation ?

Thank you very much.
Maura








tutti i telefonini TIM!


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Combining columns from two dataframes

2009-03-17 Thread Steven McKinney
Not sure how well this gets at what you want,
but it seems close (if not kludgy!)

> c <- data.frame(t(cbind(t(a), t(b))), stringsAsFactors = FALSE)[, 1, drop = 
> FALSE]
> c
   a
1 2008-07-27
2 2008-10-01
3 2008-08-15
4 2008-08-14
5 2008-08-14
6 2008-09-20
7 2008-07-27
8 2008-10-01
> c$a <- as.Date(c$a)
> c
   a
1 2008-07-27
2 2008-10-01
3 2008-08-15
4 2008-08-14
5 2008-08-14
6 2008-09-20
7 2008-07-27
8 2008-10-01
> lapply(c, class)
$a
[1] "Date"

> 


Steven McKinney, Ph.D.

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre

email: smckinney +at+ bccrc +dot+ ca

tel: 604-675-8000 x7561

BCCRC
Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C. 
V5Z 1L3
Canada




-Original Message-
From: r-help-boun...@r-project.org on behalf of Andrew McFadden
Sent: Tue 3/17/2009 2:55 PM
To: r-help@r-project.org
Subject: [R] Combining columns from two dataframes
 
I all

I am trying to combine columns from two dataframes to make a completely
new dataframe consisting of one column of dates (ie a combination of
dates from the two dataframes).

>From the following dataframes

  a b
1 2008-07-27 1
2 2008-10-01 2
3 2008-08-15 3
4 2008-08-14 4
5 2008-08-14 5
6 2008-09-20 6

   c d
1 2008-07-27 1
2 2008-10-01 2

I would like to get:

  z
 2008-07-27
 2008-10-01
 2008-08-15
 2008-08-14
 2008-08-14
 2008-09-20
 2008-07-27
 2008-10-01

a<-
data.frame(as.Date(c("2008-07-27","2008-10-01","2008-08-15","2008-08-14"
,
"2008-08-14","2008-09-20"), format = "%Y-%m-%d"),c(1:6))
names(a)=c("a","b")
a
b<- data.frame(as.Date(c("2008-07-27","2008-10-01"), format =
"%Y-%m-%d"),c(1:2))
names(b)=c("c","d")
a;b

I have tried:

append(a,b)
combine(a,b)
rbind(a,b)

I would appreciate your help.

Kind regards

andy

Andrew McFadden MVS BVSc
Incursion Investigator
Investigation & Diagnostic Centres - Wallaceville Biosecurity New 
Zealand Ministry of Agriculture and Forestry

Phone 04 894 5600 Fax 04 894 4973 Mobile 029 894 5611 Postal address: 
Investigation and Diagnostic Centre- Wallaceville Box 40742 Ward St 
Upper Hutt



This email message and any attachment(s) is intended solely for the
addressee(s) named above. The information it contains is confidential
and may be legally privileged.  Unauthorised use of the message, or the
information it contains, may be unlawful. If you have received this
message by mistake please call the sender immediately on 64 4 8940100
or notify us by return email and erase the original message and
attachments. Thank you.

The Ministry of Agriculture and Forestry accepts no responsibility for
changes made to this email or to any attachments after transmission from
the office.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How do I set the Windows temporary directory in R?

2009-03-17 Thread Steven McKinney

Check help for tempfile() and tempdir()

You can set an environment variable
as discussed in the help
to point to D:\temp

>From ?tempfile:
 
 By default, the filenames will be 
 in the directory given by tempdir(). 
 This will be a subdirectory of the 
 temporary directory found by the 
 following rule. The environment variables 
 TMPDIR, TMP and TEMP are checked in turn 
 and the first found which points to a 
 writable directory is used: if none 
 succeeds ‘/tmp’ is used.

HTH

Steven McKinney, Ph.D.

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre

email: smckinney +at+ bccrc +dot+ ca

tel: 604-675-8000 x7561

BCCRC
Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C. 
V5Z 1L3
Canada




-Original Message-
From: r-help-boun...@r-project.org on behalf of Jonathan Greenberg
Sent: Tue 3/17/2009 6:42 PM
To: r-help@r-project.org
Subject: [R] How do I set the Windows temporary directory in R?
 
I'm trying to redirect where temporary files go under R to 
D:\temp\somerandomname from its default (C:\Documents and 
Settings\username\Temp\somerandomname) -- how do I go about doing this?

--j

-- 

Jonathan A. Greenberg, PhD
Postdoctoral Scholar
Center for Spatial Technologies and Remote Sensing (CSTARS)
University of California, Davis
One Shields Avenue
The Barn, Room 250N
Davis, CA 95616
Cell: 415-794-5043
AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Is it possible for R to import a SigmaPlot file?

2009-01-22 Thread Steven McKinney
Hi Jason

Have you checked out Stat/Transfer?
www.stattransfer.com

They state they handle SYSTAT files, and
SigmaPlot is a SYSTAT product.

Stat/Transfer has a demo download you
could use to test.

I've used it to good effect (though for
other than SYSTAT datasets) in the past.


HTH


Steven McKinney

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre

email: smckinney +at+ bccrc +dot+ ca

tel: 604-675-8000 x7561

BCCRC
Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C. 
V5Z 1L3
Canada




-Original Message-
From: r-help-boun...@r-project.org on behalf of Jason Rupert
Sent: Thu 1/22/2009 2:58 PM
To: r-help@r-project.org
Subject: [R] Is it possible for R to import a SigmaPlot file?
 
I recently received a Sigmaplot file (*.jnb) from a customer and would like to 
know if I can input it to a data frame and then manipulate the data in R.  
 
I did a search on Google and on RSeek (www.rseek.org), but did not get any good 
hits.  Thank for any feedback and insight you can provide. 
 
P.S. Love the flexibility of R and would love to keep using it.   Just wanting 
to know if this is possible.   Thanks again.


  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem with subset() function?

2009-01-20 Thread Steven McKinney

D'oh!  My apologies for the noise.

I thought I had verified class
from the str() output the user was 
showing me.  

> class(subset(mydf, ht >= 150.0 & wt <= 150.0, select = c(age)))
[1] "data.frame"
> class(subset(mydf, ht >= 150.0 & wt <= 150.0, select = c(age), drop = TRUE))
[1] "integer"
> class(mydf[mydf$ht >= 150.0 & mydf$wt <= 150.0, "age"])
[1] "integer"
> density(subset(mydf, ht >= 150.0 & wt <= 150.0, select = c(age), drop = TRUE))

Call:
density.default(x = subset(mydf, ht >= 150 & wt <= 150, select = 
c(age), drop = TRUE))

Data: subset(mydf, ht >= 150 & wt <= 150, select = c(age), drop = TRUE) (29 
obs.);  Bandwidth 'bw' = 5.816

   xy
 Min.   : 4.553   Min.   :3.781e-05  
 1st Qu.:22.776   1st Qu.:3.108e-03  
 Median :41.000   Median :1.775e-02  
 Mean   :41.000   Mean   :1.370e-02  
 3rd Qu.:59.224   3rd Qu.:2.128e-02  
 Max.   :77.447   Max.   :2.665e-02  
> 



It's the "drop" arg that differs between
 density(subset(mydf, ht >= 150.0 & wt <= 150.0, select = c(age)))
and
 density(mydf[mydf$ht >= 150.0 & mydf$wt <= 150.0, "age"])

so it is
 subset(mydf, ht >= 150.0 & wt <= 150.0, select = c(age), drop = TRUE)
that is equivalent to
 mydf[mydf$ht >= 150.0 & mydf$wt <= 150.0, "age"]


Apologies and thanks for setting me straight.


Best

Steven McKinney

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre

email: smckinney +at+ bccrc +dot+ ca

tel: 604-675-8000 x7561

BCCRC
Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C. 
V5Z 1L3
Canada




-Original Message-----
From: Marc Schwartz [mailto:marc_schwa...@comcast.net]
Sent: Tue 1/20/2009 3:20 PM
To: Steven McKinney
Cc: R-help@r-project.org
Subject: Re: [R] Problem with subset() function?
 
on 01/20/2009 05:02 PM Steven McKinney wrote:
> Hi all,
> 
> Can anyone explain why the following use of
> the subset() function produces a different
> outcome than the use of the "[" extractor?
> 
> The subset() function as used in
> 
>  density(subset(mydf, ht >= 150.0 & wt <= 150.0, select = c(age)))

Here you are asking density to be run on a data frame, which is what
subset returns, even when you select a single column. Thus, you get an
error since density() expects a numeric vector.

No bug in either subset() or the documentation.

You could do this:

  density(subset(mydf, ht >= 150.0 & wt <= 150.0, select = age)[[1]])


> appears to me from documentation to be equivalent to
> 
>  density(mydf[mydf$ht >= 150.0 & mydf$wt <= 150.0, "age"])

Here you are running density on a vector, so it works. This is because
the default behavior for "[.data.frame" has 'drop = TRUE', which means
that the returned result is coerced to the lowest possible dimension.
Thus, rather than a single data frame column, a vector is returned.

The result from subset() would be equivalent to using 'drop = FALSE'.

HTH,

Marc Schwartz


> (modulo exclusion of NAs) but use of the former yields an 
> error from density.default() (shown below).
> 
> 
> Is this a bug in the subset() machinery?  Or is it
> a documentation issue for the subset() function
> documentation or density() documentation?
> 
> I'm seeing issues such as this with newcomers to R
> who initially seem to prefer using subset() instead
> of the bracket extractor.  At this point these functions
> are clearly not exchangeable.  Should code be patched
> so that they are, or documentation amended to show
> when use of subset() is not appropriate?
> 
>> ### Bug in subset()?
> 
>> set.seed(123)
>> mydf <- data.frame(ht = 150 + 10 * rnorm(100),
> +wt = 150 + 10 * rnorm(100),
> +age = sample(20:60, size = 100, replace = TRUE)
> +)
> 
> 
>> density(subset(mydf, ht >= 150.0 & wt <= 150.0, select = c(age)))
> Error in density.default(subset(mydf, ht >= 150 & wt <= 150, select = 
> c(age))) : 
>   argument 'x' must be numeric
> 
> 
>> density(mydf[mydf$ht >= 150.0 & mydf$wt <= 150.0, "age"])
> 
> Call:
>   density.default(x = mydf[mydf$ht >= 150 & mydf$wt <= 150, "age"])
> 
> Data: mydf[mydf$ht >= 150 & mydf$wt <= 150, "age"] (29 obs.); Bandwidth 'bw' 
> = 5.816
> 
>xy
>  Min.   : 4.553   Min.   :3.781e-05  
>  1st Qu.:22.776   1st Qu.:3.108e-03  
>  Median :41.000   Median :1.775e-02  
>  Mean   :41.000   Mean   :1.370e-02  
>  3rd Qu.:59.224   3rd Qu.:2.128e-02  
>  Max.   :77.447   Max.   :2.665e-02  
> 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Problem with subset() function?

2009-01-20 Thread Steven McKinney
Hi all,

Can anyone explain why the following use of
the subset() function produces a different
outcome than the use of the "[" extractor?

The subset() function as used in

 density(subset(mydf, ht >= 150.0 & wt <= 150.0, select = c(age)))

appears to me from documentation to be equivalent to

 density(mydf[mydf$ht >= 150.0 & mydf$wt <= 150.0, "age"])

(modulo exclusion of NAs) but use of the former yields an 
error from density.default() (shown below).


Is this a bug in the subset() machinery?  Or is it
a documentation issue for the subset() function
documentation or density() documentation?

I'm seeing issues such as this with newcomers to R
who initially seem to prefer using subset() instead
of the bracket extractor.  At this point these functions
are clearly not exchangeable.  Should code be patched
so that they are, or documentation amended to show
when use of subset() is not appropriate?

> ### Bug in subset()?

> set.seed(123)
> mydf <- data.frame(ht = 150 + 10 * rnorm(100),
+wt = 150 + 10 * rnorm(100),
+age = sample(20:60, size = 100, replace = TRUE)
+)


> density(subset(mydf, ht >= 150.0 & wt <= 150.0, select = c(age)))
Error in density.default(subset(mydf, ht >= 150 & wt <= 150, select = c(age))) 
: 
  argument 'x' must be numeric


> density(mydf[mydf$ht >= 150.0 & mydf$wt <= 150.0, "age"])

Call:
density.default(x = mydf[mydf$ht >= 150 & mydf$wt <= 150, "age"])

Data: mydf[mydf$ht >= 150 & mydf$wt <= 150, "age"] (29 obs.);   Bandwidth 'bw' 
= 5.816

   xy
 Min.   : 4.553   Min.   :3.781e-05  
 1st Qu.:22.776   1st Qu.:3.108e-03  
 Median :41.000   Median :1.775e-02  
 Mean   :41.000   Mean   :1.370e-02  
 3rd Qu.:59.224   3rd Qu.:2.128e-02  
 Max.   :77.447   Max.   :2.665e-02  


> sessionInfo()
R version 2.8.0 Patched (2008-11-06 r46845) 
powerpc-apple-darwin9.5.0 

locale:
C

attached base packages:
[1] stats graphics  grDevices datasets  utils methods   base     

loaded via a namespace (and not attached):
[1] Matrix_0.999375-16 grid_2.8.0 lattice_0.17-15lme4_0.99875-9
[5] nlme_3.1-89   
> 






Steven McKinney

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre

email: smckinney +at+ bccrc +dot+ ca

tel: 604-675-8000 x7561

BCCRC
Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C. 
V5Z 1L3
Canada

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Compare matrices

2009-01-19 Thread Steven McKinney

Use the is.na() function to assign
NA values:


> is.na(A) <- !B
> A
 [,1] [,2] [,3]
[1,]3   NA   NA
[2,]333
[3,]33   NA
> C <- matrix(c(3,3,3,NA,3,3,NA,3,NA),3,3)
> all.equal(A, C)
[1] TRUE





Steven McKinney

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre

email: smckinney +at+ bccrc +dot+ ca

tel: 604-675-8000 x7561

BCCRC
Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C. 
V5Z 1L3
Canada




-Original Message-
From: r-help-boun...@r-project.org on behalf of Dimitris Rizopoulos
Sent: Mon 1/19/2009 12:54 PM
To: Andrej Kastrin
Cc: r-help@r-project.org
Subject: Re: [R] Compare matrices
 
try this:

A <- matrix(c(3,3,3,3,3,3,3,3,3),3,3)
B <- matrix(c(T,T,T,F,T,T,F,T,F),3,3)

C <- A
C[!B] <- NA
C


I hope it helps.

Best,
Dimitris


Andrej Kastrin wrote:
> Dear all,
> 
> Suppose that I have a matrix A
> 
>A <- matrix(c(3,3,3,3,3,3,3,3,3),3,3)
> 
> and a logical matrix B
> 
>B <- matrix(c(T,T,T,F,T,T,F,T,F),3,3)
> 
> The result  matrix should be
> 
>C <- matrix(c(3,3,3,NA,3,3,NA,3,NA),3,3)
> 
> Is there any simple tip or trick to perform this without looping?
> 
> Thanks in advance for any suggestion.
> 
> Best regards, Andrej
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 

-- 
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] coercing a list into matrix

2009-01-14 Thread Steven McKinney

Hi Ken,

If your lists are not too large something such as this can work:

> matrixFromList <- function(listX) t(sapply(listX, function(x, n) c(x, rep(NA, 
> n))[1:n], n = max(sapply(listX, length

> matrixFromList(list(c(3,2,3),c(6,5)))
 [,1] [,2] [,3]
[1,]323
[2,]65   NA

> A <- list(c(3,2,3), c(1,2,3,4), c(5,6))

> matrixFromList(A)
 [,1] [,2] [,3] [,4]
[1,]323   NA
[2,]1234
[3,]56   NA   NA

> class(matrixFromList(A))
[1] "matrix"



Best

Steven McKinney

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre

email: smckinney +at+ bccrc +dot+ ca

tel: 604-675-8000 x7561

BCCRC
Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C. 
V5Z 1L3
Canada




-Original Message-
From: r-help-boun...@r-project.org on behalf of Lo, Ken
Sent: Wed 1/14/2009 1:50 PM
To: r-help@r-project.org
Subject: [R] coercing a list into matrix
 
Dear list,

I have a list of number sequences.  Each number sequence has different
numbers of elements.  Is there a quick way (other than to iterate
through the entire list) way to coerce list to matrix with NAs filling
in the short sequences?

An example of what I mean is this:

A <- list(c(3,2,3),c(6,5))

I'd like to get A so that it is

3 2 3
6 5 NA


Best,

Ken

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Memory Efficiency of Symmetric Matrix

2009-01-06 Thread Steven McKinney
See also the dist() function documentation.
If you use indexing as described in ?dist
it is straightforward to maintain and
use a vector of the distances.

Steven McKinney

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre

email: smckinney +at+ bccrc +dot+ ca

tel: 604-675-8000 x7561

BCCRC
Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C. 
V5Z 1L3
Canada




-Original Message-
From: r-help-boun...@r-project.org on behalf of Søren Højsgaard
Sent: Tue 1/6/2009 4:36 PM
To: Nathan S. Watson-Haigh; r-help@r-project.org
Subject: Re: [R] Memory Efficiency of Symmetric Matrix
 
You can do
mat[lower.tri(mat, diag=F)]
Søren

 


Fra: r-help-boun...@r-project.org på vegne af Nathan S. Watson-Haigh
Sendt: on 07-01-2009 01:28
Til: r-help@r-project.org
Emne: [R] Memory Efficiency of Symmetric Matrix



-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

I'm generating a symmetric correlation matrix using a data matrix as input:
mat <- cor(data.mat)

My question is:
Is there a more memory efficient way to store this data? For instance, since:
all(mat == t(mat))
every value is duplicated, and I should be able to almost half the memory usage 
for large matrices.

Any thoughts/comments?

Cheers,
Nathan


- --
- 
Dr. Nathan S. Watson-Haigh
OCE Post Doctoral Fellow
CSIRO Livestock Industries
Queensland Bioscience Precinct
St Lucia, QLD 4067
Australia

Tel: +61 (0)7 3214 2922
Fax: +61 (0)7 3214 2900
Web: http://www.csiro.au/people/Nathan.Watson-Haigh.html
- 

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.9 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org 
<http://enigmail.mozdev.org/> 

iEYEARECAAYFAklj9yAACgkQ9gTv6QYzVL6MGQCg1CHsRGAwEMah/8ZuZ9QFI6O5
lcIAnjZ68DE9FABLMd07A3AfdMPRpXIH
=5bet
-END PGP SIGNATURE-

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problems installing lme4 on Ubuntu

2008-12-19 Thread Steven McKinney


This may just be version incompatibilities.

A similar discussion recently transpired on
r-sig-mixed-models, see e.g. 

https://stat.ethz.ch/pipermail/r-sig-mixed-models/2008q4/001526.html


HTH


Steven McKinney

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre

email: smckinney +at+ bccrc +dot+ ca

tel: 604-675-8000 x7561

BCCRC
Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C. 
V5Z 1L3
Canada




-Original Message-
From: r-help-boun...@r-project.org on behalf of Bill Harris
Sent: Fri 12/19/2008 5:13 PM
To: r-help@r-project.org
Subject: [R] Problems installing lme4 on Ubuntu
 
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

While I'm not an R expert, I have used R on Windows XP.  Now I've moved
to Ubuntu (Intrepid), and I'm trying to configure R to work with the
Gelman and Hill _Data Analysis Using Regression and
Multilevel/Hierarchical Models_.  So far, it's not working.

I start by following the instructions for installing arm and BRugs at
<http://www.stat.columbia.edu/~gelman/bugsR/>.  At first, it kept
failing, but I kept installing more Ubuntu packages.  Finally, I
installed the science-statistics package, and things seemed to work: I
got the required packages to show up as loaded when I run library().

When I run library(arm) (FWIW, I'm using ESS in Emacs), I get

,
| > library(arm)
| Loading required package: MASS
| Loading required package: Matrix
| Loading required package: lattice
| 
| Attaching package: 'Matrix'
| 
| 
|   The following object(s) are masked from package:stats :
| 
|xtabs 
| 
| Loading required package: lme4
| Error in dyn.load(file, DLLpath = DLLpath, ...) : 
|   function 'cholmod_start' not provided by package 'Matrix'
| Error: package 'lme4' could not be loaded
`

even though

,
| > library()
| Packages in library '/home/bill/R/i486-pc-linux-gnu-library/2.7':
| 
| arm Data Analysis Using Regression and
| Multilevel/Hierarchical Models
| codaOutput analysis and diagnostics for MCMC
| Matrix  Sparse and Dense Matrix Classes and Methods
| R2WinBUGS   Running WinBUGS and OpenBUGS from R / S-PLUS
| 
| Packages in library '/usr/lib/R/site-library':
|   .
|   .
| latticeExtraExtra Graphical Utilities Based on Lattice
| lme4Linear mixed-effects models using S4 classes
| lmtest  Testing Linear Regression Models
|   .
|   .
|
| Packages in library '/usr/lib/R/library':
| 
|   .
|   .
|   .
| 
| Warning message:
| In library() : library '/usr/local/lib/R/site-library' contains no packages
`

and 

,
| > library(help = arm)
| 
|   Information on package 'arm'
| 
| Description:
| 
| Package:   arm
| Version:   1.1-17
| Date:  2008-11-25
| Title: Data Analysis Using Regression and
|Multilevel/Hierarchical Models
| Author:Andrew Gelman , Yu-Sung Su
|, Masanao Yajima
|, Jennifer Hill
|, Maria Grazia Pittau
|, Jouni Kerman
| and Tian Zheng
|
| Maintainer:Yu-Sung Su 
| Depends:   methods, R (>= 2.6.0), MASS, Matrix (>= 0.999375-10),
|lme4 (>= 0.999375-16), R2WinBUGS
| Suggests:  car, foreign
| Description:   R functions for processing lm, glm, mer and polr
|outputs.
| URL:   http://www.stat.columbia.edu/~gelman/software/
| License:   GPL (>= 2)
| Packaged:  Tue Nov 25 15:26:57 2008; SUYS
| Built: R 2.7.1; ; 2008-12-17 21:31:22; unix
| 
| Index:
| 
| GO-classFunction to Recall Last Source File
| balance Functions to compute the balance statistics
| bayesglmBayesian generalized linear models.
| bayespolr   Bayesian Ordered Logistic or Probit Regression
| binnedplot  Binned Residual Plot
| coefplotGeneric Function for Making Coefficient Plot
| contr.bayes.ordered Contrast Matrices
| corrplotCorrelation Plot
| display Functions for Processing lm, glm, mer and polr
| Output
| fround  Formating the Rounding of Numbers
| invlogitInverse logistic function
| lalonde Lalonde Dataset
| matchingSingle Nearest Neighborhood Matching
| model.matrix.bayes  Construct Design Matrices
| multicomp.plot  Multiple Comparison Plot
| rescale Function for Standardizing by Centering and
| Dividing by 2 sd's
| residual.plot   residual plot for the observed values
| se.coef Extract Standard Errors of Model Coefficients
| sigma.

Re: [R] How can I draw bars

2008-12-09 Thread Steven McKinney
Hi Raymond,

You might be able to get the plots you want
with the 'cytoband' material provided in
the Bioconductor package "SNPchip".

Check out the SNPchip data set 'cytoband'
and the plot function "plotCytoband"
for ideas.  This material is used e.g.
in Bioconductor package "VanillaICE"
to add a chromosome graphic to the
bottom of data plots of genomic data.
You can see examples in the vignette
PDFs.

 
HTH
 

Steven McKinney

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre

email: [EMAIL PROTECTED]
tel: 604-675-8000 x7561

BCCRC
Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C. 
V5Z 1L3

Canada


 

 


> -Original Message-
> From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED]
> On Behalf Of stephen sefick
> Sent: Tuesday, December 09, 2008 7:48 PM
> To: Raymond Balise
> Cc: r-help@r-project.org
> Subject: Re: [R] How can I draw bars
> 
> arrows and I don't remember what package, but I will see if I can
> remember.
> 
> On Tue, Dec 9, 2008 at 3:02 PM, Raymond Balise
<[EMAIL PROTECTED]>
> wrote:
> > I need to make a graphic to show problems on different parts of
> > chromosomes (think of a graphic showing the number of frayed threads
> > as colors along different parts of a worn out rope).  I want to draw
> > bars going from left to right across a page and color different
parts
> > of the bars in different shades.  Each graphic will need to have
> > several bars of different lenghts corresponding to the different
> > lenghts of the chromsomes.
> >
> > Is there a package around that can help me draw custom bars of
> > different colors?  I am a hardcore SAS guy who is just learning R so
I
> > am mostly clueless.
> >
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
> 
> 
> 
> --
> Stephen Sefick
> 
> Let's not spend our time and resources thinking about things that are
> so little or so large that all they really do for us is puff us up and
> make us feel like gods.  We are mammals, and have not exhausted the
> annoying little problems of being mammals.
> 
>   -K.
Mullis
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help with reading code

2008-12-03 Thread Steven McKinney
Hi Dana

> -Original Message-
> From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED]
> On Behalf Of Dana77
> Sent: Wednesday, December 03, 2008 3:24 PM
> To: r-help@r-project.org
> Subject: [R] Help with reading code
> 
> 
> I would like to give out the equation for calculating the maximum
> likelihood.
> Below is the code, but I still have problems with it.  After I read
the
> code, I found there are two cases for "w(weights)".  If  "w" is not
zero,
> then the equation is given as "val <- 0.5 * (sum(log(w)) - N * (log(2
*
> pi)
> + 1 - log(N) +
> log(sum(w * res^2". However, if "w" is zero, then I do not
> know
> what equation it should be since it does not make any sense for
"log0".
> Hope
> someone can help me to figure this out. Thanks!
> 
> 
> 
> 
> function (object, REML = FALSE, ...)
> {
> res <- object$residuals
> p <- object$rank
> N <- length(res)
> if (is.null(w <- object$weights)) {
> w <- rep.int(1, N)
> }
> else {
> excl <- w == 0  #I can not understand the following lines
after this.

The command above sets a variable to "exclude" with.
Any observation with weight equal to zero will get excluded.
excl will have value TRUE for all observations with weight 0.

> if (any(excl)) { ### If there are any observations to exclude
> res <- res[!excl]   ### then keep the ones with weight not
zero
> N <- length(res)
> w <- w[!excl]}

so the above chunk keeps only the residuals and weights
for the observations with non-zero weights

> }
> N0 <- N
> if (REML)
> N <- N - p
> val <- 0.5 * (sum(log(w)) - N * (log(2 * pi) + 1 - log(N) +
> log(sum(w * res^2

so now there are no more observations with w == 0 in the above equation

> if (REML)
> val <- val - sum(log(abs(diag(object$qr$qr)[1:p])))
> attr(val, "nall") <- N0
> attr(val, "nobs") <- N
> attr(val, "df") <- p + 1
> class(val) <- "logLik"
> val
> }
> 
> 
> Best,
> 
> Dana
> --
> View this message in context: http://www.nabble.com/Help-with-reading-
> code-tp20823979p20823979.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.



 

Steven McKinney

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre

email: [EMAIL PROTECTED]
tel: 604-675-8000 x7561

BCCRC
Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C. 
V5Z 1L3

Canada

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] AIC function and Step function

2008-11-27 Thread Steven McKinney
Hi Dana,

Many thanks to Christos Hatzis who sent
me an offline response, pointing out the
new functions that make this much
easier than my last suggestions:
methods() and getAnywhere()

> methods("extractAIC")
[1] extractAIC.aov* extractAIC.coxph*   extractAIC.glm* extractAIC.lm*  
extractAIC.negbin* 
[6] extractAIC.survreg*

   Non-visible functions are asterisked
> getAnywhere("extractAIC.coxph")
A single object matching ‘extractAIC.coxph’ was found
It was found in the following places
  registered S3 method for extractAIC from namespace stats
  namespace:stats
with value

function (fit, scale, k = 2, ...) 
{
edf <- length(fit$coef)
loglik <- fit$loglik[length(fit$loglik)]
c(edf, -2 * loglik + k * edf)
}

> 

Thank you Christos.


That said, one of the advantages of getting
the source code is that it has comments that
are stripped out when the code is sourced into R

e.g. from the command line

> getAnywhere(AIC.default)
A single object matching ‘AIC.default’ was found
It was found in the following places
  registered S3 method for AIC from namespace stats
  namespace:stats
with value

function (object, ..., k = 2) 
{
ll <- if ("stats4" %in% loadedNamespaces()) 
stats4:::logLik
else logLik
if (length(list(...))) {
object <- list(object, ...)
val <- lapply(object, ll)
val <- as.data.frame(t(sapply(val, function(el) c(attr(el, 
"df"), AIC(el, k = k)
names(val) <- c("df", "AIC")
Call <- match.call()
Call$k <- NULL
row.names(val) <- as.character(Call[-1])
val
}
else AIC(ll(object), k = k)
}


>From the source file


#  File src/library/stats/R/AIC.R
#  Part of the R package, http://www.R-project.org
#
#  This program is free software; you can redistribute it and/or modify
#  it under the terms of the GNU General Public License as published by
#  the Free Software Foundation; either version 2 of the License, or
#  (at your option) any later version.
#
#  This program is distributed in the hope that it will be useful,
#  but WITHOUT ANY WARRANTY; without even the implied warranty of
#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#  GNU General Public License for more details.
#
#  A copy of the GNU General Public License is available at
#  http://www.r-project.org/Licenses/

 Return the object's value of the Akaike Information Criterion
 (or "An Inf.. Crit..")

AIC <- function(object, ..., k = 2) UseMethod("AIC")

## AIC for logLik objects
AIC.logLik <- function(object, ..., k = 2)
-2 * c(object) + k * attr(object, "df")

AIC.default <- function(object, ..., k = 2)
{
## AIC for various fitted objects --- any for which there's a logLik() 
method:
ll <- if("stats4" %in% loadedNamespaces()) stats4:::logLik else logLik
if(length(list(...))) {# several objects: produce data.frame
object <- list(object, ...)
val <- lapply(object, ll)
val <- as.data.frame(t(sapply(val,
  function(el)
  c(attr(el, "df"), AIC(el, k = k)
    names(val) <- c("df", "AIC")
Call <- match.call()
Call$k <- NULL
row.names(val) <- as.character(Call[-1])
val
} else AIC(ll(object), k = k)
}



Steven McKinney

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre

email: smckinney +at+ bccrc +dot+ ca

tel: 604-675-8000 x7561

BCCRC
Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C. 
V5Z 1L3
Canada

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] AIC function and Step function

2008-11-27 Thread Steven McKinney
ues
## for loglik
edf <- length(fit$coef)
loglik <- fit$loglik[length(fit$loglik)]
c(edf, -2 * loglik + k * edf)
}

extractAIC.survreg <- function(fit, scale, k = 2, ...)
{
edf <- sum(fit$df)
c(edf, -2 * fit$loglik[2] + k * edf)
}

extractAIC.glm <- function(fit, scale = 0, k = 2, ...)
{
n <- length(fit$residuals)
edf <- n  - fit$df.residual
    aic <- fit$aic
c(edf, aic + (k-2) * edf)
}

extractAIC.lm <- function(fit, scale = 0, k = 2, ...)
{
n <- length(fit$residuals)
edf <- n  - fit$df.residual
RSS <- deviance.lm(fit)
dev <- if(scale > 0) RSS/scale - n else n * log(RSS/n)
c(edf, dev + k * edf)
}
extractAIC.aov <- extractAIC.lm

extractAIC.negbin <- function(fit, scale, k = 2, ...)
{
n <- length(fit$residuals)
edf <- n - fit$df.residual
c(edf, -fit$twologlik + k * edf)
}


HTH



Steven McKinney

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre

email: smckinney +at+ bccrc +dot+ ca

tel: 604-675-8000 x7561

BCCRC
Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C. 
V5Z 1L3
Canada




-Original Message-
From: [EMAIL PROTECTED] on behalf of Dana77
Sent: Thu 11/27/2008 6:11 PM
To: r-help@r-project.org
Subject: [R]  AIC function and Step function
 

I would like to figure out the equations for calculating "AIC" in both
 "step() function" and "AIC () function". They are different. Then I  
just type "step" in the R console, and found the "AIC" used in "step()  
function" is "extractAIC". I went to the R help, and found:

"The criterion used is

AIC = - 2*log L + k * edf,
where L is the likelihood and edf the equivalent degrees of freedom (i.e.,
the number of free parameters for usual parametric models) of fit.

For linear models with unknown scale (i.e., for lm and aov), -2log L is
computed from the deviance and uses a different additive constant to logLik
and hence AIC. If RSS denotes the (weighted) residual sum of squares then
extractAIC uses for - 2log L the formulae RSS/s - n (corresponding to
Mallows' Cp) in the case of known scale s and n log (RSS/n) for unknown
scale. AIC only handles unknown scale and uses the formula n log (RSS/n) - n
+ n log 2? - sum log w where w are the weights."


Now, my question is what code I should use to look at the exact calculation
process in the AIC()function and extractAIC() function in R? Thanks!

Dana
-- 
View this message in context: 
http://www.nabble.com/AIC-function-and-Step-function-tp20728043p20728043.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] 64bit R for Mac

2008-11-24 Thread Steven McKinney
Dear Marco

Check out

http://r.research.att.com/

and see if the 

R 2.8.0 Patched (2008-11-21)tiger
R-2.8-branch-n.dmg

works for you (n = 47002 at
this writing - changes every now and then).

I'm no longer running Tiger so haven't
tried this directly, but the Leopard
version is working well.

You might also want to follow the
r-sig-mac list for mac-specific
information.
 

 
HTH

Steven McKinney

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre

email: [EMAIL PROTECTED]
tel: 604-675-8000 x7561

BCCRC
Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C. 
V5Z 1L3

Canada


 

 


> -Original Message-
> From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED]
> On Behalf Of Blanchette, Marco
> Sent: Monday, November 24, 2008 6:34 AM
> To: R-help
> Subject: [R] 64bit R for Mac
> 
> Dear R gurus,
> 
> On the CRAN website, it says that a 64bit version for Mac OS Tiger
would
> be release shortly. Do we know what are the expected dates? Will the
> packages be also compiled for 64bit?
> 
> We are running large microarray analysis and we keep hitting the 3Gb
> memory limit.
> 
> I saw that there is a version available on the development mirrors,
but I
> am not too excited to replace our very stable and reliable 32bit
version
> with a 64bit binary that might not be that stable and with packages
that
> would need to be 64bit compiled on site...
> 
> Cheers
> --
> Marco Blanchette, Ph.D.
> Assistant Investigator
> Stowers Institute for Medical Research
> 1000 East 50th St.
> 
> Kansas City, MO 64110
> 
> Tel: 816-926-4071
> Cell: 816-726-8419
> Fax: 816-926-2018
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] 64bit R for Mac

2008-11-24 Thread Steven McKinney
(Apologies if this repost is a duplicate,
my first submission did not appear to
make it through.)

-Original Message-
From: Steven McKinney
Sent: Mon 11/24/2008 4:28 PM
To: 'Blanchette, Marco'; R-help
Subject: RE: [R] 64bit R for Mac
 
Dear Marco

Check out

http://r.research.att.com/

and see if the 

R 2.8.0 Patched (2008-11-21)tiger
R-2.8-branch-n.dmg

works for you (n = 47002 at
this writing - changes every now and then).

I'm no longer running Tiger so haven't
tried this directly, but the Leopard
version is working well.

You might also want to follow the
r-sig-mac list for mac-specific
information.
 

 
HTH

Steven McKinney

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre

email: [EMAIL PROTECTED]
tel: 604-675-8000 x7561

BCCRC
Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C. 
V5Z 1L3

Canada


 

 


> -Original Message-
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> On Behalf Of Blanchette, Marco
> Sent: Monday, November 24, 2008 6:34 AM
> To: R-help
> Subject: [R] 64bit R for Mac
> 
> Dear R gurus,
> 
> On the CRAN website, it says that a 64bit version for Mac OS Tiger would
> be release shortly. Do we know what are the expected dates? Will the
> packages be also compiled for 64bit?
> 
> We are running large microarray analysis and we keep hitting the 3Gb
> memory limit.
> 
> I saw that there is a version available on the development mirrors, but I
> am not too excited to replace our very stable and reliable 32bit version
> with a 64bit binary that might not be that stable and with packages that
> would need to be 64bit compiled on site...
> 
> Cheers
> --
> Marco Blanchette, Ph.D.
> Assistant Investigator
> Stowers Institute for Medical Research
> 1000 East 50th St.
> 
> Kansas City, MO 64110
> 
> Tel: 816-926-4071
> Cell: 816-726-8419
> Fax: 816-926-2018
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to test for the empty set

2008-11-24 Thread Steven McKinney
Dear Jay,

length(x) will return zero
for all these (vector) examples,
and is quite readable in code.
(You might have seen code such as
if(length(x)) { blah }
which uses this idea.) 

Does this cover all your use
cases?


> x <- c()

> class(x)
[1] "NULL"

> length(x)
[1] 0

> y <- letters[1:3]
> z <- letters[4:6]
> x <- intersect(y,z)

> class(x)
[1] "character"

> length(x)
[1] 0

> y <- 1:3
> z <- 4:6
> x <- intersect(y,z)

> class(x)
[1] "integer"

> length(x)
[1] 0
> 


HTH

Steven McKinney

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre

email: smckinney +at+ bccrc +dot+ ca

tel: 604-675-8000 x7561

BCCRC
Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C. 
V5Z 1L3
Canada




-Original Message-
From: [EMAIL PROTECTED] on behalf of G. Jay Kerns
Sent: Mon 11/24/2008 9:41 AM
To: r-help@r-project.org
Subject: [R] how to test for the empty set
 
Dear R-help,

I first thought that the empty set (for a vector) would be NULL.

x <- c()
x

However, the documentation seems to make clear that there _many_ empty
sets depending on the vector's mode, namely, numeric(0), character(0),
logical(0), etc.  This is borne out by

y <- letters[1:3]
z <- letters[4:6]
intersect(y,z)

which, of course, is non-NULL:

is.null(character(0))   # FALSE

So, how can we test if a vector is, say, character(0)?  The following
doesn't (seem to) work:

x <- character(0)
x == character(0)  # logical(0)

More snooping led to the following:

wiki.r-project.org/rwiki/doku.php?id=tips:surprises:emptysetfuncs

and at the bottom of the page it says "logical(0) is an empty set,
thus is TRUE".  However, I get

isTRUE(logical(0))   # FALSE

but, on the other hand,

all.equal(x, character(0))  # TRUE

This would seem to be the solution, but am I missing something? and in
particular, is there an elegant way to check in the case that the mode
of the vector is not already known?

Thanks in advance for any insight you may have.

Best,
Jay




***
G. Jay Kerns, Ph.D.
Associate Professor
Department of Mathematics & Statistics
Youngstown State University
Youngstown, OH 44555-0002 USA
Office: 1035 Cushwa Hall
Phone: (330) 941-3310 Office (voice mail)
-3302 Department
-3170 FAX
E-mail: [EMAIL PROTECTED]
http://www.cc.ysu.edu/~gjkerns/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] lmer p-values for fixed effects missing

2008-11-18 Thread Steven McKinney

> -Original Message-
> From: [EMAIL PROTECTED] on behalf of UKaraoz
> Sent: Tue 11/18/2008 2:16 PM
> To: r-help@r-project.org
> Subject: [R]  lmer p-values for fixed effects missing
>  
> 
> I am trying to replicate the repeated measures example from Dr.Faraway's book
> (Extending the linear model with R) as follows:
> 
> data(vision)
> vision$npower <- rep(1:4,14)
> 
> mmod <-lmer(acuity~power+(1|subject)+(1|subject:eye),vision)
> 
> 
> When I look at the fixed effects p-value, it is missing. Am I missing
> something here?
> Thanks.
> 
> Fixed effects:
> Estimate Std. Error t value
> (Intercept) 112.6429 2.2348   50.40
> power6/18 0.7857 1.54000.51
> power6/36-1. 1.5400   -0.65
> power6/60 3.2857 1.54002.13
> 



See the discussion titled "lmer, p-values and all that" at
https://stat.ethz.ch/pipermail/r-help/2006-May/094765.html
and a collection of related discussions at
http://wiki.r-project.org/rwiki/doku.php?id=guides:lmer-tests

In a nutshell:
The complexity of fully crossed mixed effects models
makes degrees of freedom and other model attributes
more difficult to calculate compared to fixed effects 
models.  

Research and debate is ongoing as to the best way to
handle model assessment, not all classical summary
statistics and associated p-values map to lmer models 
in a straightforward fashion.


Steven McKinney

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre

email: smckinney +at+ bccrc +dot+ ca

tel: 604-675-8000 x7561

BCCRC
Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C. 
V5Z 1L3
Canada



> 
> -- 
> View this message in context: 
> http://www.nabble.com/lmer-p-values-for-fixed-effects-missing-tp20569114p20569114.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 
> 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] lines.formula with subset

2008-11-13 Thread Steven McKinney



 


> -Original Message-
> From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED]
> On Behalf Of John Field
> Sent: Thursday, November 13, 2008 4:41 PM
> To: r-help@r-project.org
> Subject: [R] lines.formula with subset
> 
> Dear list,
> 
> When I try to use lines.formula with subset and another argument I
> get an error.
> e.g.
> 
> x<-1:5
> y<-c(1,3,NA,2,5)
> plot(y~x, type="n")  # set up frame
> lines(y~x, subset=!is.na(y))  # works OK
> lines(y~x, type="o", col="blue")  # works OK
> # but
> lines(y~x, subset=!is.na(y), col="red")  # gives an error:
> 
> Error in if (length(x) == l) x[s] else x : argument is of length zero
> 
> Why does this happen?
> 

It happens because the function
graphics:::lines.formula

tries to assess the number of rows of data in the data frame
containing the variables in the formula y~x
(see the line of code 
l <- nrow(data)
in graphics:::lines.formula
This is the 'el' in the 'length(x) == l'
portion of the line you see in the error message)

Because you did not provide the data frame,
nrow(data) returns NULL, and thus the
if() clause is 'length(x) == NULL' which
yields answer logical(0), an invalid
answer in an if() clause.

Done this way, all is well:

mydf <- data.frame(x = 1:5, y = c(1,3,NA,2,5))
plot(y~x, type="n", data = mydf)  # set up frame
lines(y~x, subset=!is.na(y), data = mydf)  # works OK
lines(y~x, type="o", col="blue", data = mydf)  # works OK
lines(y~x, subset=!is.na(y), col="red", data = mydf)  # works OK

The formula - based functions expect to see a dataframe object
from the 'data' arg, but don't enforce this in this case. 

This may qualify as a logical bug in the graphics:::lines.formula
function.  An error should have been thrown before the if()
clause evaluation, but I'm not sure where in the chain of
function calls the check for a valid data object should be 
done and the error thrown. Otherwise, the data objects
y and x that you set up should have been passed downwards
in some fashion for evaluation.  R-core members who know 
the rules better than I will have to determine how best to 
handle this one.

HTH 

Steven McKinney

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre

email: [EMAIL PROTECTED]
tel: 604-675-8000 x7561

BCCRC
Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C. 
V5Z 1L3

Canada



> With thanks,
> John
> 
> =
> John Field Consulting Pty Ltd
> 10 High Street, Burnside SA 5066
> Phone 08 8332 5294 or 0409 097 586
> Fax   08 8332 1229
> Email  [EMAIL PROTECTED]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] bizarre axes with xyplot, problem in data?

2008-10-30 Thread Steven McKinney
> If anyone is bored I am still interested in why R
> orders lattice plots from bottom to top.

The idea was that, just as small values of
x are typically found to the left end of the X axis
and small values of y are typically found at the
bottom end of the Y axis; a trellis plot corresponding
to a small value of the conditioning variable should 
appear to the lower left of the trellis plot set,
and a trellis plot corresponding to a large value of 
the conditioning variable should be found to the upper 
right of the trellis plot set.  

Sounds like wise graphical layout policy, but is 
surprisingly confusing in many cases.  Thankfully
Deepayan Sarkar has included options to allow
alternate layouts.

(Reference:
Becker, Cleveland, Shyu, Kaluzny
http://www.research.att.com/areas/stat/doc/95.12.color.ps
Section 2.6 Layout
"Remember that trellis displays are filled as graphs,
from the origin in the lower left, not top-down as in
a table." )



Steven McKinney

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre

email: smckinney +at+ bccrc +dot+ ca

tel: 604-675-8000 x7561

BCCRC
Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C. 
V5Z 1L3
Canada




-Original Message-
From: [EMAIL PROTECTED] on behalf of Phillip Porter
Sent: Thu 10/30/2008 11:43 AM
To: r-help@r-project.org
Subject: Re: [R] bizarre axes with xyplot, problem in data?
 
My SAS brain was still plugged in.  I had a missing data point entered as a
"."  I didn't think anything of it since SAS treats that as missing data.
Apparently it confuses R.  I deleted the "." re-imported the data and
everything was beautiful.  If anyone is bored I am still interested in why R
orders lattice plots from bottom to top.

Thanks,
Phillip

On Thu, Oct 30, 2008 at 1:25 PM, Phillip Porter <[EMAIL PROTECTED]>wrote:

> Good Morning,
> I am using xyplot to show two variables for each of several subjects as
> follows:
> xyplot(y~x|as.factor(ID), type="b", layout=c(7,9),
> scales=list(x=list(tick.number=3), y=list(tick.number=5)))
>
> This is almost exactly the code I used for an earlier project, the only
> change is the number of ticks, but I'm getting all kinds of craziness on my
> Y axis.  I played around with everything, and by only using the first 100 or
> so subjects I can get everything to work out right, but when I run the whole
> dataset my Y axis goes crazy.  When I do 1,1 in layout and stretch the
> window the full height of my screen it looks like it is starting at 100 and
> going up to the top value of my Y data, and then continuing on the Y values
> lower than 100, but everything is overlapped so I'm not quite sure.
>
> Is the problem in my data? (read.csv from an excel file) Is it in the range
> of my data?  Is it some little detail I'm leaving out of my code?
>
> Thanks,
> Phillip
>
> P.S.  This isn't very important, but I am curious and maybe one of you
> knows, why does R start from the bottom and go up when doing the lattice
> plots?  My first subjects are at the bottom of the page, and move higher as
> you move up the page.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Computational problems in R

2008-10-28 Thread Steven McKinney

I misinterpreted the suggestion from Duncan
Murdoch, and I think Amy did too, judging
from her comment about B possibly being negative.

Duncan Murdoch kindly explained the idea
off-list, many thanks to him for taking the time.
To avoid further confusion, I've set up a basic
example explaining the suggestion.

Suppose this is the sum 
sum(exp(c_i/d)) i = 1, ..., n
that needs evaluating:

exp(-71) + exp(12) + exp(842) + exp(2100) + exp(141) + exp(-710)

The maximum term in the sum is exp(2100) 
so in the notation of the identity
A+B = A*(1 + B/A) = exp(log(A) + log(1 + B/A))

A = exp(2100)
B = exp(-71) + exp(12) + exp(842) + exp(141) + exp(-710)

so sum(exp(c_i/d)) i = 1, ..., n is
exp(2100) * ( 1 + ((exp(-71) + exp(12) + exp(842) + exp(141) + 
exp(-710))/exp(2100)) )
= exp(2100) * ( 1 + exp(-71 - 2100) + exp(12 - 2100) + exp(842 - 2100) + 
exp(141 - 2100) + exp(-710 - 2100) )

and of course all the terms 
exp(-71 - 2100) 
exp(12 - 2100) 
exp(842 - 2100) 
exp(141 - 2100) 
exp(-710 - 2100)
are very close to zero, so the sum
sum(exp(c_i/d)) i = 1, ..., n is 
exp(2100)
or 2100 on the log scale.

The terms in B are of course never negative as they
are all exponential quantities greater than zero,
so that's not an issue in this calculation (this
'B' not being the same as the 'B' in Amy's original
description).

Thanks again to Duncan Murdoch, and my apologies
for my misinterpretation.


Steven McKinney


-Original Message-
From: Robin Hankin [mailto:[EMAIL PROTECTED]
Sent: Tue 10/28/2008 12:58 AM
To: A.Noufaily
Cc: Duncan Murdoch; r-help@r-project.org; [EMAIL PROTECTED]; Steven McKinney; 
Xiaoxu LI
Subject: Re: [R] Computational problems in R
 
Hello.

The Brobdingnag package uses that identity for a logarithmic
representation and also has a hack for negative numbers.


HTH

rksh



A.Noufaily wrote:
>  
> Many thanks for your suggestions...
>
> I am still checking which one is the most useful for my simulations.
>
> Concerning using logs, this might be very helpful, but I am not sure if
> I can use the following:
> A+B = A*(1 + B/A)
> = exp(log(A) + log(1 + B/A))
> because unfortunately B can be negative.
> However, I might still use logs in case (1 + B/A)>0.
>
> Regards,
>  
> Amy
>
> -Original Message-
> From: Duncan Murdoch [mailto:[EMAIL PROTECTED] 
> Sent: Saturday, October 25, 2008 11:36 AM
> To: Steven McKinney
> Cc: A.Noufaily; r-help@r-project.org
> Subject: Re: [R] Computational problems in R
>
> On 24/10/2008 9:50 PM, Steven McKinney wrote:
>   
>> I suspect there's a deeper issue here.
>> sum(exp(yi)) when large yi occur is
>> problematic.  exp(yi) for yi>710 is
>> just a huge number, and summing additional values only makes the 
>> overall sum larger as all components of the summation are positive.
>> There's no way around that.
>> 
>
> Sure there is, and you quoted it below.  Work on a log scale.  The log
> of exp(yi) is yi, and it sounds as though the yi values are manageable.
>
> You might end up knowing that the log of the final answer is 2 and
> not be able to evaluate exp(2) in R, but you still know that the
> answer is exp(2).
>
> Duncan Murdoch
>   
>> You could try this with Robin Hankins'
>> package "brobdingnag" which can handle bunches of bizarrely large 
>> numbers.
>>
>> What kind of process are you studying?
>> What kind of process generates values
>> such as exp(8/0.01) when other values
>> are much smaller?  Are these outliers
>> in an otherwise well-behaved
>> data set?  Perhaps then they need
>> to be set aside and investigated
>> separately, etc.
>>
>>
>> Steven McKinney
>>
>> Statistician
>> Molecular Oncology and Breast Cancer Program British Columbia Cancer 
>> Research Centre
>>
>> email: smckinney +at+ bccrc +dot+ ca
>>
>> tel: 604-675-8000 x7561
>>
>> BCCRC
>> Molecular Oncology
>> 675 West 10th Ave, Floor 4
>> Vancouver B.C. 
>> V5Z 1L3
>> Canada
>>
>>
>>
>>
>> -Original Message-
>> From: [EMAIL PROTECTED] on behalf of Duncan Murdoch
>> Sent: Fri 10/24/2008 4:04 PM
>> To: A.Noufaily
>> Cc: r-help@r-project.org
>> Subject: Re: [R] Computational problems in R
>>  
>> On 24/10/2008 12:42 PM, A.Noufaily wrote:
>> 
>>> Dear all,
>>>
>>> I would be grateful if anyone can help me with the following:
>>>
>>> My aim is to compute explicitely the sum S=A+B where 
>>> A=sum(exp(c_i/d)), i=1,...,n; B, c_i, and d are real numbers with 
>>> -Inf0.
>>> The problem is that when c_i/d >710 (for so

Re: [R] Computational problems in R

2008-10-24 Thread Steven McKinney
I suspect there's a deeper issue here.
sum(exp(yi)) when large yi occur is
problematic.  exp(yi) for yi>710 is
just a huge number, and summing additional
values only makes the overall sum larger as all
components of the summation are positive.
There's no way around that.

You could try this with Robin Hankins'
package "brobdingnag" which can handle
bunches of bizarrely large numbers.

What kind of process are you studying?
What kind of process generates values
such as exp(8/0.01) when other values
are much smaller?  Are these outliers
in an otherwise well-behaved
data set?  Perhaps then they need
to be set aside and investigated
separately, etc.


Steven McKinney

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre

email: smckinney +at+ bccrc +dot+ ca

tel: 604-675-8000 x7561

BCCRC
Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C. 
V5Z 1L3
Canada




-Original Message-
From: [EMAIL PROTECTED] on behalf of Duncan Murdoch
Sent: Fri 10/24/2008 4:04 PM
To: A.Noufaily
Cc: r-help@r-project.org
Subject: Re: [R] Computational problems in R
 
On 24/10/2008 12:42 PM, A.Noufaily wrote:
> Dear all,
> 
> I would be grateful if anyone can help me with the following:
> 
> My aim is to compute explicitely the sum S=A+B where A=sum(exp(c_i/d)),
> i=1,...,n;
> B, c_i, and d are real numbers with -Inf0.
> The problem is that when c_i/d >710 (for some i) R is setting
> exp(c_i/d) to be equal to +Inf and hence the whole summation S.
> So in simple cases where for example c_i=8 (for some i), and d=0.01 the
> whole summation is turning out to be infinite.
> Is there a way to get round that in R?
> Can anyone suggest any computational trick to calculate S when c_i/d>710
> (for some i)?

Work on a log scale.  Use the identity that

A+B = A*(1 + B/A)
 = exp(log(A) + log(1 + B/A))

(where you chose A to be the biggest term in the sum).

Duncan Murdoch

> 
> Any suggestions would be much appreciated.
> 
> Regards,
> 
> Amy
> 
> 
> 
> 
> 
> -
> The Open University is incorporated by Royal Charter (RC 000391), an exempt 
> charity in England & Wales and a charity registered in Scotland (SC 038302).
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] combining data from different datasets

2008-10-24 Thread Steven McKinney
If you are using regular R graphs
(i.e. not lattice or other library
graphics) try setting the margins
with the mar argument to par()

e.g.
par(mar = c(5, 10, 5, 1))

The four numbers specify the amount
of margin room on the bottom, left, top, right
respectively.  Set the left margin value large
enough to give your labels enough room.


HTH


Steven McKinney

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre

email: smckinney +at+ bccrc +dot+ ca

tel: 604-675-8000 x7561

BCCRC
Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C. 
V5Z 1L3
Canada




-Original Message-
From: [EMAIL PROTECTED] on behalf of Dr Eberhard W Lisse
Sent: Fri 10/24/2008 2:44 PM
To: R-help Mailing List
Cc: Dr Eberhard W Lisse
Subject: Re: [R] combining data from different datasets
 
It's Complicated® :-)-O

I pull the data from a postgresql table, but I am getting
there, thank you the help.

Another question, I am barplotting the continents horizontally,
ie the more participants the lrger the bars are. I have managed
the make the labels (names of the nontinents) to be horizontally
but the longer ones (North and South America) flow off the left
side (right justified).

So, how do I make the plot smaller, and move it to the right so
that the complete names appear? I can make the plot smaller in
inches but that doesn't scale if I enlarge the window.

Pointers to read will be fine, but so far I haven't found it :-)-O

el

On 24 Oct 2008, at 18:24 , Gabor Grothendieck wrote:

> On Fri, Oct 24, 2008 at 11:37 AM, Dr Eberhard W Lisse <[EMAIL PROTECTED]>  
> wrote:
>>
>> This looks very cool.
>>
>> But I must still make a plan with regards to country = "NA" (Namibia)
>> or continent = "NA" (North America)
>>
>> But there are the vignettes.
>>
>> el
>>
>
> NA and "NA" are not the same:
>
>> DF <- data.frame(x = c("a", "NA", NA))
>> DF
> x
> 1a
> 2   NA
> 3 
>>
>> is.na(NA)
> [1] TRUE
>> is.na("NA")
> [1] FALSE
>

--
Dr. Eberhard W. Lisse  \/ Obstetrician & Gynaecologist (Saar)
[EMAIL PROTECTED] el108-ARIN /   *   |   Telephone: +264 81 124 6733 (cell)
PO Box 8421 \ / Please send DNS/NA-NiC related e-mail
Bachbrecht, Namibia ;/ to [EMAIL PROTECTED]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] help regarding levels

2008-10-03 Thread Steven McKinney
Your column of data has some
character data in it, perhaps
an Excel #VALUE! or a blank or
some such entry not strictly numeric.

When R reads in such a column, it
makes that column variable into
a 'factor' variable instead of a
numeric variable, because the values
are not all numeric.

You can specify 
stringsAsFactors = FALSE
to force R to leave character
data as is if you do not want
conversion to factors.

If there is Excel cruft in a variable column,
it will end up as a factor or character variable.
You can try to make it numeric with

 df$x <- as.numeric(as.character(df$x))

(assuming you read the data into a dataframe called df)
and R will replace the Excel cruft with
NA values.

Alternatively you can clean up the Excel data
before importing it.

HTH

Steve McKinney



-Original Message-
From: [EMAIL PROTECTED] on behalf of K3
Sent: Fri 10/3/2008 4:09 PM
To: r-help@r-project.org
Subject: [R] help regarding levels
 
When i try to extract a column of data from an excel file and assign
it to a variable , say x, it does assign the column of data as well as
different levels.
it looks something like this when i print.

[1] 6.91 5.89 7.44 8.82
80 Levels: 1.43 102.07 103.65 106.21 106.24 107.15 108.58 11.19 ...

so how does this levels come into picture and what do they do. I
couldnot run linear regression with x as a predictor just because of
this levels. please explain.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] print.data.frame : row.name = FALSE not having intendedeffect

2008-09-23 Thread Steven McKinney

Here's the current method for printing a data frame:


> print.data.frame
function (x, ..., digits = NULL, quote = FALSE, right = TRUE, 
row.names = TRUE) 
{
n <- length(row.names(x))
if (length(x) == 0L) {
cat("NULL data frame with", n, "rows\n")
}
else if (n == 0L) {
print.default(names(x), quote = FALSE)
cat("<0 rows> (or 0-length row.names)\n")
}
else {
m <- as.matrix(format.data.frame(x, digits = digits, 
na.encode = FALSE))
if (!isTRUE(row.names)) 
dimnames(m)[[1]] <- if (identical(row.names, FALSE)) 
rep.int("", n)
else row.names
print(m, ..., quote = quote, right = right)
}
invisible(x)
}

> 

so you can either set up your own copy of this
or make a matrix copy of your data frame (as
done in the function above) and set the row names
to ""


HTH

Steve McKinney



-Original Message-
From: [EMAIL PROTECTED] on behalf of Matthew Pettis
Sent: Tue 9/23/2008 6:55 PM
To: r-help@r-project.org
Subject: Re: [R] print.data.frame : row.name = FALSE not having intendedeffect
 
Never mind -- the answer is buried in my own question... I was looking
at documentation for version 2.7.2, and when I looked at the one for
2.6.2, I see the row.names option isn't in that release.

Any suggestions on how I can code around that in 2.6.2, so I don't
have to upgrade to 2.7.2 just yet?

Thanks,
Matt

On Tue, Sep 23, 2008 at 8:49 PM, Matthew Pettis
<[EMAIL PROTECTED]> wrote:
> Hi,
>
> Does anybody know if row.name = FALSE actually works in v2.6.2?
> Because it isn't working for me... here is some sample code and
> output:
>
> ++++++++++
>> print(x,row.names = FALSE)
>   party_abbr   candidate_name votes_candidate
> 2 DFLAMY KLOBUCHAR   1,278,849
> 5   R MARK KENNEDY 835,653
> 4  IPROBERT FITZGERALD  71,194
> 3  GP MICHAEL JAMES CAVLAN  10,714
> 1  CP   BEN POWERS   5,408
> 10 WI   WRITE-IN** 901
> 8  WI PETER IDUSOGIE**  29
> 6  WICHARLES ALDRICH**  15
> 9  WI REBECCA WILLIAMSON**   5
> 7  WI   JOHN ULDRICH**   4
>>
> ++++++++++
>
> Thanks,
> Matt
>
> --
> It is from the wellspring of our despair and the places that we are
> broken that we come to repair the world.
> -- Murray Waas
>



-- 
It is from the wellspring of our despair and the places that we are
broken that we come to repair the world.
-- Murray Waas

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] sort a data matrix by all the values and keep the names

2008-09-22 Thread Steven McKinney
Is something missing in the melt()?

> x<-data.frame(x1=c(1,7),x2=c(4,6),x3=c(8,2))
> require("reshape")
Loading required package: reshape
> dfm <- melt(x, id = c())
Error in if (!missing(id.var) && !(id.var %in% varnames)) { : 
  missing value where TRUE/FALSE needed
> dfm[order(dfm$value), ]
Error: object "dfm" not found
> x
  x1 x2 x3
1  1  4  8
2  7  6  2
> melt(x, id = c())
Error in if (!missing(id.var) && !(id.var %in% varnames)) { : 
  missing value where TRUE/FALSE needed
>


Steve McKinney


-Original Message-
From: [EMAIL PROTECTED] on behalf of hadley wickham
Sent: Mon 9/22/2008 5:47 PM
To: zhihuali
Cc: [EMAIL PROTECTED]
Subject: Re: [R] sort a data matrix by all the values and keep the names
 
On Mon, Sep 22, 2008 at 6:54 PM, zhihuali <[EMAIL PROTECTED]> wrote:
>
> Dear all,
>
> If I have a data frame  x<-data.frame(x1=c(1,7),x2=c(4,6),x3=c(8,2)):
>   x1  x2  x3
>   1 4  8
>   7 6  2
>
> I want to sort the whole data and get this:
> x1 1
> x3  2
> x2  4
> x2  6
> x1   7
> x3   8
>
>  If I do sort(X), R reports:
> Error in order(list(x1 = c(1, 7), x2 = c(4, 6), x3 = c(8, 2)), decreasing = 
> FALSE) :
>  unimplemented type 'list' in 'orderVector1'
>
> The only way I can sort all the data is by converting it to a matrix:
>> sort(as.matrix(x))
> [1] 1 2 4 6 7 8
>
> But now I lost all the names attributes.
>
> Is it possible to sort a data frame and keep all the names?

Here's one way:

dfm <- melt(x, id = c())
dfm[order(dfm$value), ]

Hadley

-- 
http://had.co.nz/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to keep up with R?

2008-09-19 Thread Steven McKinney
One thing that R does very well that SAS does not is
graphics - graphical portrayal of data is important,
and you can keep up with R by supplementing your
SAS analyses with R graphics.

Steve McKinney



-Original Message-
From: [EMAIL PROTECTED] on behalf of Barry Rowlingson
Sent: Fri 9/19/2008 12:36 AM
To: Wensui Liu
Cc: r-help
Subject: Re: [R] how to keep up with R?
 
2008/9/19 Wensui Liu <[EMAIL PROTECTED]>:
> Dear Listers,
>
> I've been a big fan of R since graduate school. After working in the
> industry for years, I haven't had many opportunities to use R and am mainly
> using SAS. However, I am still forcing myself really hard to stay close to R
> by reading R-help and books and writing R code by myself for fun. But by and
> by, I start realizing I have hard time to keep up with R and am afraid that
> I would totally forget how to program in R.
>
> I really like it and am very unwilling to give it up. Is there any idea how
> I might keep touch with R without using it in work on daily basis? I really
> appreciate it.
>

 How about doing some kind of presentation on R at your work? It's
possible that some of the old fossils don't even know about it at all,
and use SAS because to them the alternative is SPSS. Do some R
evangelization. Find a task that R does better than SAS (not
difficult) and illustrate that to your superiors. Then when they ask
how much a corporate R license is, you tell them it's free, or say
it'll cost them a 2% raise in your salary, or say it will cost them
your resignation if you are feeling brave!

 Sure you may be tied to SAS for some other reasons, but there's no
reason why you can't use R for other things. Work out how to get it
into your corporate framework. Encourage your colleagues to look at it
for their tasks. Enthuse.

 The good thing about training and evangelization is that at first you
don't need mad skillz at R to do it. I have trouble understanding some
of the tips on R-help (especially when do.call() is used), but you can
teach new people with a good knowledge of the basics, which you should
still have. Eventually the hope is that enough people use R at your
workplace to develop a community where everyone keeps everyone else on
their toes with R questions!

 Good luck!

Barry

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Different PCA results under Windows and Linux

2008-09-16 Thread Steven McKinney

Not likely that anyone can explain, as
there is not enough information in your
email.

Including the contents of the freqtest.txt file
was a good idea, as the posting guide suggests
(the posting guide is that clearly labeled bit
at the bottom that looks like this:
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
Check it out! It is cool.)

Additionally, include the command 
  sessionInfo() 
and its output from all machines you refer to
so maintainers know which versions of software
you are running.  Also, include the output you obtained
from your code (with your code being a self-contained 
and reproducible set of  R commands).

Finally, describe what the difference is and why
the difference is problematic (i.e. don't report
machine precision differences, or sign differences
for PCA results - PCA vector directions are arbitrary
modulo 180 degrees).

> I also tried mean(xrcc2) and sd(xrcc2) on both machines, the results are the
> same. 
> Please explain.

The R maintainers do an amazing job of creating
numerically stable platform-independent software,
so you get the same results almost everywhere.
(Thank you R core!)


HTH

Steve McKinney

-Original Message-
From: [EMAIL PROTECTED] on behalf of jathine
Sent: Tue 9/16/2008 2:19 PM
To: r-help@r-project.org
Subject: [R]  Different PCA results under Windows and Linux
 

I ran the following R script under both Linux and Windows, and got 2
different results.
Linux R version 2.7.1 and Windows R version 2.7.2.

> library(FactoMineR)
>x1=read.table("freqtest.txt",header=TRUE)
>xrcc2=x1[,1:8]
>p1=PCA(xrcc2, graph=FALSE)
>p1$var

freqtest.txt file lines of text :
M1 M2 M3 M4 M5 M6 M7 M8
-1 -1 -1 -1 -1 -1 -1 -1
0 0 0 0 -1 -1 1 1
-1 -1 -1 -1 -1 -1 -1 -1
0 0 0 0 -1 -1 1 1 

I also tried mean(xrcc2) and sd(xrcc2) on both machines, the results are the
same. 
Please explain.


-- 
View this message in context: 
http://www.nabble.com/Different-PCA-results-under-Windows-and-Linux-tp19520449p19520449.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Using quasibinomial family in lmer

2008-09-16 Thread Steven McKinney
Hi Russell,


You might join in on the discussion on the R-sig-ME
mixed effects listserv

(see e.g. Doug Bates' message today (Sept 16 2008) titled 
Re: [R-sig-ME] glmer and overdispersed Poisson models)

There Prof. Bates suggests that older versions of lmer() may be doing
things more appropriately (though maybe not yet optimally).
(He also asks for input on how this might be
addressed - I'm not yet quasi-specialized enough to
figure it out but maybe someone reading this will
be.)

Here's what I get with a slightly older version of lmer - note that
my quasi-binomial t-value is not as extreme as yours.

Note also that for the output below, the ratio of the 
standard deviation estimates for the intercept estimate
> 0.1219/0.07441
[1] 1.638221
equals the residual standard deviation, so it appears
that for the quasibinomial case lmer or some lower
level routine estimates the scale parameter phi using the
residual standard deviation, though I have not delved
into the source code to verify this.  So some attempt
to modify scale appears to be in place for this version
of lmer.


library(lme4)
set.seed(12)
eta=rnorm(50)
p=exp(eta)/(1+exp(eta))
y=rbinom(50,20,p)/20 #IID overdispersed binomial-normal proportions
#y=rbinom(50,20,0.5)/20 #IID binomial(20,0.5)
Group=rep(c("A","B","C","D","E"),c(10,10,10,10,10))
lmer(y~1+(1|Group),weights=rep(20,50),family="binomial")
lmer(y~1+(1|Group),weights=rep(20,50),family="quasibinomial")
sessionInfo()

> library(lme4)
> set.seed(12)
> eta=rnorm(50)
> p=exp(eta)/(1+exp(eta))
> y=rbinom(50,20,p)/20 #IID overdispersed binomial-normal proportions
> #y=rbinom(50,20,0.5)/20 #IID binomial(20,0.5)
> Group=rep(c("A","B","C","D","E"),c(10,10,10,10,10))
> lmer(y~1+(1|Group),weights=rep(20,50),family="binomial")
Generalized linear mixed model fit using Laplace 
Formula: y ~ 1 + (1 | Group) 
 Family: binomial(logit link)
   AIC   BIC logLik deviance
 149.7 153.6 -72.87145.7
Random effects:
 Groups NameVariance  Std.Dev.
 Group  (Intercept) 0.0075745 0.087032
number of obs: 50, groups: Group, 5

Estimated scale (compare to  1 )  1.638542 

Fixed effects:
Estimate Std. Error z value Pr(>|z|)  
(Intercept) -0.137390.07441  -1.847   0.0648 .
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
> lmer(y~1+(1|Group),weights=rep(20,50),family="quasibinomial")
Generalized linear mixed model fit using Laplace 
Formula: y ~ 1 + (1 | Group) 
 Family: quasibinomial(logit link)
   AIC   BIC logLik deviance
 149.7 153.6 -72.87145.7
Random effects:
 Groups   NameVariance Std.Dev.
 Group(Intercept) 0.020336 0.14261 
 Residual 2.684818 1.63854 
number of obs: 50, groups: Group, 5

Fixed effects:
Estimate Std. Error t value
(Intercept)  -0.1374 0.1219  -1.127
> sessionInfo()
R version 2.7.1 Patched (2008-06-25 r45988) 
powerpc-apple-darwin8.11.1 

locale:
C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base 

other attached packages:
[1] RMySQL_0.6-0  DBI_0.2-4 RODBC_1.2-3   lme4_0.99875-9   
[5] Matrix_0.999375-9 lattice_0.17-10  

loaded via a namespace (and not attached):
[1] grid_2.7.1
> 



Steven McKinney

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre

email: smckinney +at+ bccrc +dot+ ca

tel: 604-675-8000 x7561

BCCRC
Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C. 
V5Z 1L3
Canada




-Original Message-
From: [EMAIL PROTECTED] on behalf of [EMAIL PROTECTED]
Sent: Tue 9/16/2008 3:18 PM
To: r-help@r-project.org
Subject: [R] Using quasibinomial family in lmer
 
Dear R-Users,

I can't understand the behaviour of quasibinomial in lmer. It doesn't
appear to be calculating a scaling parameter, and looks to be reducing the
standard errors of fixed effects estimates when overdispersion is present
(and when it is not present also)! A simple demo of what I'm seeing is
given below. Comments appreciated?

Thanks,

Russell Millar
Dept of Stat
U. Auckland

PS. I'm using the latest version of lme4 (0.999375-26) with R 2.7.2.

> eta=rnorm(50)
> p=exp(eta)/(1+exp(eta))
> y=rbinom(50,20,p)/20 #IID overdispersed binomial-normal proportions
> #y=rbinom(50,20,0.5)/20 #IID binomial(20,0.5)
>
> Group=rep(c("A","B","C","D","E"),c(10,10,10,10,10))
>
> #library(lme4)
>
> lmer(y~1+(1|Group),weights=rep(20,50),family="binomial")
Generalized linear mixed model fit by the Laplace approximation
Formula: y ~ 1 + (1 | Group)
 AIC   BIC logLik deviance
 211 214.8 -103.5  207
Random effects:
 Groups NameVariance Std.Dev.
 Group  (Intercept) 0.072891 0.26998

Re: [R] bootstrapping - number of items to replace is not a multiple of replacement length

2008-09-09 Thread Steven McKinney
Hello,

Your theta() function is returning different
sets of coefficients depending on the results of
step().

You'll need to add code to theta() to figure
out which variables were selected, and store
them into the right positions of a vector
of length 20 (the apparent number of covariates
you describe below), so that your theta()
function always returns the same sized output.  

(Google

   stepwise regression random forest

and you'll get a number of hits
about using random forests instead
of stepwise regression, and pointers
about bootstrapping the random forest.)

HTH

Steve McKinney


> -Original Message-
> From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED]
> On Behalf Of Gabriela Bucini
> Sent: Tuesday, September 09, 2008 5:02 PM
> To: r-help@r-project.org
> Subject: [R] bootstrapping - number of items to replace is not a
multiple
> ofreplacement length
> 
> Hello,
> 
> I'm new to boostrapping and I'd need some help to understand the error
> message that pops up when I run my script.
> 
> I have a data.frame with 73 lines and 21 column.
> I am running a stepwise regression to find the best model using the R
> function "step".
> I apply bootstrapping to obtain model coefficients.
> This is my script:
> 
> # "datare80" is the name of the data.frame and "woodycover" is the
> response
> variable
> theta <- function(datare80, indices) {
>  d <- datare80[indices, ]# allows boot to select
subsample
> datasets
>  full <- lm(d$woodycover~ ., data= d )
>  lmbroadst <- step(full, data=d , direction = "both", k=2,
> trace=0)
>  coefficients(lmbroadst)  # return coef. vector
>   }
> resb <- boot(data = datare80, statistic = theta, R=1000)
> 
> 
> 
> When I run it, I get these two messages:
> If I omit the last line "coefficients(lmbroadst)" in the function
> definition, I get :
> "Error in t.star[r, ] <- statistic(data, i[r, ], ...) :
> incorrect number of subscripts on matrix"
> If I have the last line "coefficients(lmbroadst)", then I get:
> "Error in t.star[r, ] <- statistic(data, i[r, ], ...) :
> number of items to replace is not a multiple of replacement
> length"
> 
> 
> Thank you very much for any help!
> 
> Gabriela
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] lattice command equivalent to points

2008-09-02 Thread Steven McKinney


This is close, but maybe not optimal lattice coding.
I haven't yet figured out how to suppress the x axis
labeling.

bwplot(yield ~ 1|year, panel = function(x, y, ...){panel.bwplot(x, y, ..., pch 
= "|"); panel.points(x, mean(y), ...,  pch=17)}, data = barley, horizontal = 
FALSE, xlab = "")

Steve McKinney


-Original Message-
From: [EMAIL PROTECTED] on behalf of Surai
Sent: Tue 9/2/2008 5:53 PM
To: r-help@r-project.org
Subject: [R] lattice command equivalent to points
 
Hello,
This is my first post to R-help so forgive me if I'm committing a fatal error 
somehow.  Feel free to let me know.
 
My question is this:  I would like to add a triangle to represent the mean in a 
box plot generated with the bwplot command.   How do I do this?  I am able to 
do this using the boxplot and points command but points does not work with 
bwplot.    If you run the following code in R, you'll see that I'm trying to 
reproduce graphs from method 2 by modifying code from method 1.
 
Thank you,
Surai Jones
 
 
library(lattice)
attach(barley)
 
#method 1
bwplot(yield~year, pch="|") 
 
#method 2
boxplot(yield~year)
mean.values<-tapply(yield,year, mean)
points(1:2, mean.values, pch = 17)


  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Newbie: Examples on functions callling a library etc.

2008-08-28 Thread Steven McKinney

Hi Ed, 

Here's a simple example showing your needs:


myfun <- function(n1, n2, n3) {
  mat1 <- matrix(rep(1), nrow = n1, ncol = 3)
  mat2 <- matrix(rep(2), nrow = n2, ncol = 4)
  mat3 <- matrix(rep(3), nrow = n3, ncol = 5)

  require("survival") ## make sure the package you need is loaded
  mypkgfun <- survival::is.Surv ## Use the '::' and ':::' extractors to get 
visible and hidden functions respectively from the package

  list(mat1 = mat1, mat2 = mat2, mat3 = mat3, mypkgfun = mypkgfun) ## Return 
the items in a list
}

## Now invoke the function
foo <- myfun(n1 = 1, n2 = 1, n3 = 5)

## and look at the returned results
foo

> myfun <- function(n1, n2, n3) {
+   mat1 <- matrix(rep(1), nrow = n1, ncol = 3)
+   mat2 <- matrix(rep(2), nrow = n2, ncol = 4)
+   mat3 <- matrix(rep(3), nrow = n3, ncol = 5)
+ 
+   require("survival")
+   mypkgfun <- survival::is.Surv
+ 
+   list(mat1 = mat1, mat2 = mat2, mat3 = mat3, mypkgfun = mypkgfun)
+ }
> 
> foo <- myfun(n1 = 1, n2 = 1, n3 = 5)
> 
> foo
$mat1
 [,1] [,2] [,3]
[1,]111

$mat2
 [,1] [,2] [,3] [,4]
[1,]2222

$mat3
 [,1] [,2] [,3] [,4] [,5]
[1,]33333
[2,]33333
[3,]33333
[4,]33333
[5,]33333

$mypkgfun
function (x) 
inherits(x, "Surv")


> 

HTH

Steve McKinney


-Original Message-
From: [EMAIL PROTECTED] on behalf of Eduardo M. A. M.Mendes
Sent: Thu 8/28/2008 5:43 PM
To: r-help@r-project.org
Subject: [R] Newbie:  Examples on functions callling a library etc.
 
Hello

 

R is pretty new to me. I need to write a function that returns three
matrices of different dimensions.  In addition, I need to call a function
from a contributed package with the function.   I have browsed several
manuals and docs but the examples on them are either very simple or
extremely hard to follow.

 

Many thanks

 

Ed

 

 

 


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] conditional with and operators

2008-08-19 Thread Steven McKinney


Did you try it with the vector '&' and operator?

d<-sapply(res,function(.df){(.df$TimesVisited[.df$Tick>912 & .df$Id>0])}) 

(The '&&' operator is designed for use in
e.g. if() clauses where you want a scalar logical
answer)

HTH

Steve McKinney

-Original Message-
From: [EMAIL PROTECTED] on behalf of Altaweel, Mark R.
Sent: Tue 8/19/2008 1:10 PM
To: r-help@r-project.org
Subject: [R] conditional with and operators
 
Hi,

I have a problem in which I am parsing data from a list.  I am simply trying to 
return data that has several conditions being true.  Here is my syntax below:

d<-sapply(res,function(.df){(.df$TimesVisited[.df$Tick>912 && .df$Id>0])})   
#res is the list, and I am trying to return a result that has two true 
conditions (that is the variable Tick should be greate than 912 and Id should 
be greater than 0

This returns an array of 10 with integer values of 0. This is the incorrect 
result

However, if I do the same syntax except I remove the && statement (i.e. the 
second conditional), then the result producing something that makes sense, 
which is all values that are Tick and greater than 912.

Can someone let me know how I can setup my data to be parsed so I can have 2 or 
multiple conditionals in my function that is looking at an array.

Thanks in advance.

Mark

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Detecting duplicate values

2008-08-14 Thread Steven McKinney

> -Original Message-
> From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED]
> On Behalf Of [EMAIL PROTECTED]
> Sent: Thursday, August 14, 2008 2:00 PM
> To: r-help@r-project.org
> Subject: [R] Detecting duplicate values
> 
> This is another "how do I do it" type of question. It seems that for a
> function that I have it has a hard time with lists that have a single
> repeated value. I want a function (or expression)  that will detect
this
> condition. I want to detect:
> 
> c(2,2,2,2,2)
> OR
> c(1)
> 
> The following is OK
> 
> c(2,3,3,2)
> OR
> c(1,2,1)
> 
> These lists are "OK" because they have another value in the list. In
other
> words I want to detect if there is only one element in the list that
> repeats 1 or more times.
> 
> Will length(table(a)) = 1 always work? If so I have answered my own
> question. If not please bear with me as I have missed a case.

Consider also 
length(unique(a)) == 1

Steve McKinney

> 
> Thank you.
> 
> Kevin
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] missing TRUE/FALSE error in conditional construct

2008-08-13 Thread Steven McKinney


> -Original Message-
> From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED]
> On Behalf Of rcoder
> Sent: Wednesday, August 13, 2008 3:17 PM
> To: r-help@r-project.org
> Subject: [R] missing TRUE/FALSE error in conditional construct
> 
> 
> Hi everyone,
> 
> I posted something similar to this in reply to another post, but there
> seems
> to be a problem getting it onto the board, so I'm starting a new post.
> 
> I am trying to use conditional formatting to select non-zero and
non-NaN
> values from a matrix and pass them into another matrix. The problem is
> that
> I keep encountering an error message indicating the ":missing value
where
> TRUE/FALSE needed "
> 
> My code is as follows:
> 
> ##Code Start
> mat_zeroless<-matrix(NA,5000,2000) #generating holding matrix
> 
> ##Assume source matrix containing combination of values, NaNs and
zeros##
> for (j in 1:5000)
> {
> for (k in 1:2000)
> {
> if(mat[j,k]!=0 & !is.NaN(mat[j,k])) {mat_zeroless[j,k]<-mat[j,k]}
> }
> }
> ##Code End
> 
> Error in if (mat[j,k] !=0 & !is.NaN(mat[j,k])) { :missing value where
> TRUE/FALSE needed
> 
> I'm not sure how to resolve this.

This seems to do what you appear to need,
no loops required (always best whenever possible):

> set.seed(123)
> mat <- matrix(sample(1:10), nrow = 5)
> mat
 [,1] [,2]
[1,]31
[2,]8   10
[3,]49
[4,]72
[5,]65
> is.na(mat) <- c(2, 3)
> mat
 [,1] [,2]
[1,]31
[2,]   NA   10
[3,]   NA9
[4,]72
[5,]65
> mat[5,] <- 0
> mat
 [,1] [,2]
[1,]31
[2,]   NA   10
[3,]   NA9
[4,]72
[5,]00
> matno0 <- matrix(NA, nrow = nrow(mat), ncol = ncol(mat))
> matno0
 [,1] [,2]
[1,]   NA   NA
[2,]   NA   NA
[3,]   NA   NA
[4,]   NA   NA
[5,]   NA   NA
> mat[!is.na(mat) & !(mat == 0)]
[1]  3  7  1 10  9  2
> matno0[!is.na(mat) & !(mat == 0)] <- mat[!is.na(mat) & !(mat == 0)]
> matno0
 [,1] [,2]
[1,]31
[2,]   NA   10
[3,]   NA9
[4,]72
[5,]   NA   NA
>

Other issues:

The error messages you are seeing generally
are happening because you are generating
non-scalar TRUE / FALSE outcomes or
NA outcomes in your if() clause.

An if() clause should generate scalar
TRUE or FALSE only.

To this end, the S language provides
the operators '&&' (scalar AND)
and '||' (scalar OR) for use in
logical scalar clauses.  Further,
if the first argument to '&&' tests
FALSE, the second is not even evaluated,
so you could have done

if ( !is.na(mat[j, k]) && mat[j, k] != 0 ) { blah }

In this case, once the NA is found, the if() clause
evaluation is over and you don't have to worry about
the NA that will result from the second argument
mat[j, k] != 0

 

> matno0 <- matrix(NA, nrow = nrow(mat), ncol = ncol(mat))
> matno0
 [,1] [,2]
[1,]   NA   NA
[2,]   NA   NA
[3,]   NA   NA
[4,]   NA   NA
[5,]   NA   NA
> for ( j in 1:5 ) {
+  for ( k in 1:2 ) {
+   if ( !is.na(mat[j, k]) && mat[j, k] != 0 ) { matno0[j, k] <- mat[j,
k] }
+  }
+ }
> 
> matno0
 [,1] [,2]
[1,]31
[2,]   NA   10
[3,]   NA9
[4,]72
[5,]   NA   NA
> 

HTH

Steve McKinney

> 
> rcoder
> --
> View this message in context:
http://www.nabble.com/missing-TRUE-FALSE-
> error-in-conditional-construct-tp18972244p18972244.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Conditional statement used in sapply()

2008-08-13 Thread Steven McKinney


> -Original Message-
> From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED]
> On Behalf Of Altaweel, Mark R.
> Sent: Wednesday, August 13, 2008 3:03 PM
> To: r-help@r-project.org
> Subject: [R] Conditional statement used in sapply()
> 
> Hi,
> 
> I have data stored in a list that I would like to aggregate and
perform
> some basic stats. However, I would like to apply conditional
statements so
> that not all the data are used.  Basically, I want to get a specific
> variable, do some basic functions (such as a mean), but only get the
data
> in each element's data that match the condition. The code I used is
below:
> 
> > result<-sapply(res, function(.df) {   #res is the list containing
file
> data
> + if(.df$Volume>0)mean(.df$Volume)  #only have the mean function
calculate
> on values great than 0
> + })
> 

You probably want something such as
result<-sapply(res, function(.df) {  mean(.df$Volume[.df$Volume>0]) })  

HTH
Steve McKinney

> 
> I did get a numeric output; however, when I checked the output value
the
> conditional was ignored (i.e. it did not do anything to the
calculation)
> 
> I also obtained these warning statements:
> 
> Warning messages:
> 1: In if (.df$Volume > 0) mean(.df$Volume) :
>   the condition has length > 1 and only the first element will be used
> 2: In if (.df$Volume > 0) mean(.df$Volume) :
>   the condition has length > 1 and only the first element will be used
> 
> Please let me know what am I doing wrong and how can I apply a
conditional
> statement to the sapply function.
> 
> Thanks
> 
> Mark
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ignoring zeros or converting to NA

2008-08-12 Thread Steven McKinney


 
 
The help page for is.na()
is worth reading repeatedly.

> -Original Message-
> From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED]
> On Behalf Of stephen sefick
> Sent: Tuesday, August 12, 2008 6:53 PM
> To: Charles C. Berry
> Cc: Mike Prager; [EMAIL PROTECTED]
> Subject: Re: [R] ignoring zeros or converting to NA
> 
> I have been reading this thread and I am having a hard interpreting
> what these mean.  I know that the result is that all of the values
> that are zero in a are replaced by NA.  Let me try and write it out
> 
> is.na(a[a==0] ) <- TRUE
> you pull out of a all of the times that are equal to zero then is.na
> tests and returns false then all of the false values are set to true?

Not quite - the function 'is.na<-' needs an index vector
as input.  So my understanding is that all the values in 'a'
that are zero will have their value set to NA, because
R will replicate the TRUE (vector of length 1) to
a vector of length (sum(a==0)), this being the index vector.

> 
> is.na(a) <- a==0
> make values in a NA  when a is 0?

Right.  a==0 generates a logical vector with length
(length(a)) with value 'TRUE' wherever 'a' has a
zero entry. 'is.na<-' will then set those entries to NA.
The entries where 'a==0' is FALSE will not have their
values changed to NA.

The version ' is.na(a[a==0] ) <- TRUE '
is doing the same thing, the logical index vector
for ' a[a==0] ' consists of all 'TRUE'.

HTH

Steve McKinney

> 
> is this right?  what the logic if not?
> thanks
> 
> Stephen Sefick
> 
> 
> On Tue, Aug 12, 2008 at 6:32 PM, Charles C. Berry
<[EMAIL PROTECTED]>
> wrote:
> > On Tue, 12 Aug 2008, Mike Prager wrote:
> >
> >> rcoder <[EMAIL PROTECTED]> wrote:
> >>
> >>> I have a matrix that has a combination of zeros and NAs. When I
> perform
> >>> certain calculations on the matrix, the zeros generate "Inf"
values.
> Is
> >>> there a way to either convert the zeros in the matrix to NAs, or
only
> >>> perform the calculations if not zero (i.e. like using something
> similar
> >>> to
> >>> an !all(is.na() construct)?
> >>
> >> Is this what you are looking for?
> >>
> >>> # make some data
> >>> a = matrix(c(rep(0,6), rep(2,6)), nrow = 4)
> >>> a
> >>
> >>[,1] [,2] [,3]
> >> [1,]002
> >> [2,]002
> >> [3,]022
> >> [4,]022
> >>>
> >>> # change zero to NA
> >>> is.na(a[a==0] ) <- TRUE
> >
> > Or
> >is.na(a) <- a==0
> >
> > Chuck
> >
> >>> a
> >>
> >>[,1] [,2] [,3]
> >> [1,]   NA   NA2
> >> [2,]   NA   NA2
> >> [3,]   NA22
> >> [4,]   NA22
> >>
> >> --
> >> Mike Prager, NOAA, Beaufort, NC
> >> * Opinions expressed are personal and not represented otherwise.
> >> * Any use of tradenames does not constitute a NOAA endorsement.
> >>
> >> __
> >> R-help@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >>
> >
> > Charles C. Berry(858) 534-2098
> >Dept of Family/Preventive
> > Medicine
> > E mailto:[EMAIL PROTECTED]   UC San Diego
> > http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego
92093-
> 0901
> >
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
> 
> 
> 
> --
> Let's not spend our time and resources thinking about things that are
> so little or so large that all they really do for us is puff us up and
> make us feel like gods. We are mammals, and have not exhausted the
> annoying little problems of being mammals.
> 
>   -K. Mullis
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] source a script file straight from a subversion repository

2008-08-01 Thread Steven McKinney


Thanks to Duncan Murdoch and Marc Schwartz for their
excellent help.

As we don't yet have the apache http: interface to svn
running yet, 'svn export' is the access mechanism.

As I don't want to leave stale files lying around,
or clobber a local file should there be one with
the same name, R's tempfile() and Marc's ideas
work very well.  


  ## Get a local copy of the file from the
  ## version control repository svn on myrepo.xxx.org
  tempfilename <- tempfile()
  shellcmd <-
paste("svn export ",
  "svn://myrepo.xxx.org/opt/svn/repos/",
  "Projects/SMcode/tags/v1.0/SMfuns.R ",
  tempfilename,
  sep = "")
  system(shellcmd)
  if (file.exists(tempfilename) ) {
source(tempfilename)
unlink(tempfilename)
  } else {
stop("Can not get copy of SMfuns.R")
  }

I can wrap this in a function where
needed as Marc suggests.

Thanks again for the help

Best

Steve McKinney


> -Original Message-
> From: Marc Schwartz [mailto:[EMAIL PROTECTED]
> Sent: Fri 8/1/2008 5:42 PM
> To: Steven McKinney
> Cc: r-help@r-project.org
> Subject: Re: [R] source a script file straight from a subversion repository
>  
> on 08/01/2008 06:49 PM Steven McKinney wrote:
> > Hi useRs
> > 
> > I'm trying to figure out how to source
> > an R script file straight from a subversion
> > repository, without having to put a copy
> > of the script into the local working directory.
> > 
> > Has anyone done this?  
> > 
> > Something such as
> > 
> >   source(file =  paste("svn://myrepo.xxx.org/opt/svn/repos/",
> >  "/Projects/SMcode/tags/v1.0/SMfuns.R",
> >  sep = ""))
> > 
> > though this does not work.
> > 
> > Perhaps I need a connection of some sort
> > instead of just a text string for the
> > file argument?
> > 
> > I'm not sure if this can work, and have not
> > found an example searching through the archives
> > and help manuals as yet.
> > 
> > If anyone has done this and can offer any
> > tips I'd appreciate it.
> 
> The problem is that the files in a svn repo are not stored as directly 
> addressable files. They are stored in a database, typically using the 
> Berkeley DB, but will use FSFS if NFS is to be used. You can't get to 
> them without going through the svn interface (either CLI or one of the 
> GUIs or something like Trac).
> 
> Thus, you either have to do a checkout ('svn co'), which means an entire 
> directory tree or sub-tree, or do an export ('svn export') to access a 
> single file. In the former case, you get a 'working copy' of the tree 
> which can be managed by svn. In the latter case, you get a single file, 
> but one that is not under svn control.
> 
> To do what you want, some variation of:
> 
> 1. 'svn export' the file
> 2. source() the file
> 3. delete the file
> 
> would be required. You can wrap all of that in a function to shield the 
> individual steps, but at minimum, the 'svn export' would be required to 
> get to the single file locally. You can call the 'svn export' using the 
> R system() function, thus doing it all within an R session.
> 
> So something like:
> 
> SVNSource <- function(FullPathToSVNFileName)
> {
>cmd <- paste("svn export", FullPathToSVNFileName)
>system(cmd)
>source(basename(FullPathToSVNFileName))
>file.remove(basename(FullPathToSVNFileName))
> }
> 
> The basename() R function returns just the file name, stripping the path 
> component.
> 
> Thus, given your example:
>  
> SVNSource("svn://myrepo.xxx.org/opt/svn/repos/Projects/SMcode/tags/v1.0/SMfuns.R")
> 
> should hopefully work, but I would test it all to be sure.
> 
> This presumes that the command:
> 
> $ svn export 
> svn://myrepo.xxx.org/opt/svn/repos/Projects/SMcode/tags/v1.0/SMfuns.R
> 
> works from the CLI to begin with to get the file locally.
> 
> HTH,
> 
> Marc Schwartz

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] source a script file straight from a subversion repository

2008-08-01 Thread Steven McKinney

Hi useRs

I'm trying to figure out how to source
an R script file straight from a subversion
repository, without having to put a copy
of the script into the local working directory.

Has anyone done this?  

Something such as

  source(file =  paste("svn://myrepo.xxx.org/opt/svn/repos/",
 "/Projects/SMcode/tags/v1.0/SMfuns.R",
 sep = ""))

though this does not work.

Perhaps I need a connection of some sort
instead of just a text string for the
file argument?

I'm not sure if this can work, and have not
found an example searching through the archives
and help manuals as yet.

If anyone has done this and can offer any
tips I'd appreciate it.

Best

Steve McKinney

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] sort rows of matrix by rows of another matrix

2008-07-31 Thread Steven McKinney

I came up with the same solution as Mark Leeds.

> a1roworders <- t(apply(a1, 1, order))
> a2ord <- t(sapply(seq(nrow(a1)), function(x) a2[x, a1roworders[x,]] ))
> a2ord
 [,1] [,2] [,3]
[1,]  102  101  103
[2,]  102  101  103
[3,]  103  101  102
[4,]  101  102  103

Mark's question about the orientation of
apply() output is common, and a source of
confusion for many who are used to matrix
results that just show up in the right orientation
from most other S-language function output.

The help page for apply() states
 "If each call to FUN returns a vector of length n, 
  then apply returns an array of dimension 
  c(n, dim(X)[MARGIN]) if n > 1."

So, whenever you use apply() on
margin 1 (the rows) and your function
returns a result for each column, each
call to FUN returns a vector of length
n = ncol(X), and this dimension becomes
the row dimension of the result (i.e.
column dimensions and row dimensions
appear mysteriously switched).

Since apply() can be used on multiple
margins of arrays with many margins, 
some standard output format had to be
set up, and the format is stated
above.  

So this is just one of those bits of
S language trivia that you have to file
away in the grey matter - since apply()
can be used on arbitrary arrays, it has
to have a standard output format, and
that standard appears to transpose your
result matrix when you do column-wise
stuff to the rows of your two-dimensional
matrix.

The old Blue Book ("The New S Language")
showed examples

 apply(x, 2, sort) # sort columns of x
 t(apply(x, 1, sort)) # transpose result of row sort

and commented

 "The sorting examples show the difference between row and
  column operations when the results returned by FUN are vectors.
  The returned value becomes the  dimension of the result,
  hence the transpose is necessary with row sorts."

This was a useful comment and pointed this issue
out plainly.  I'd vote for this set of examples
to be in R's help page for apply(), with the comment
as well, e.g. 

 apply(x, 2, sort) # sort columns of x
 t(apply(x, 1, sort)) # transpose result of row sort

## The sorting examples show the difference between row and
## column operations when the results returned by FUN are vectors.
## The returned value becomes the  dimension of the result,
## hence the transpose is necessary with row sorts."

would be useful examples and comments for R's apply()
help page.

HTH

Steve McKinney





-Original Message-
From: [EMAIL PROTECTED] on behalf of [EMAIL PROTECTED]
Sent: Thu 7/31/2008 3:34 PM
To: Johnson, Eric A. (Seattle)
Cc: r-help@r-project.org
Subject: Re: [R] sort rows of matrix by rows of another matrix
 
below is another way ( maybe the same ?  )but with an extra line to make 
roworder. i'm also not clear on why I  have to take the transpose of 
the result that
comes back form the apply call. in ?apply, it says that the function 
tries to convert the result back to a matrix. that's fine but why does 
it do
it in the opposite way from the way the data in sent in ( i.e : by row 
). if someone could explain that, i'd appreciate it.


a1 <- structure(c(7, 4, 4, 0, 6, 2, 7, 3, 8, 4, 2, 8), .Dim = c(4L, 3L))
a2 <- structure(c(101L, 101L, 101L, 101L, 102L, 102L, 102L, 102L, 103L,
103L, 103L, 103L), .Dim = c(4L, 3L))

roworder <- t(apply(a1,1,order))

a3 <- t(sapply(1:nrow(a2),function(.rowind) {
   a2[.rowind,roworder[.rowind,]]
}))

print(a3)



On Thu, Jul 31, 2008 at  6:05 PM, Johnson, Eric A. (Seattle) wrote:

> If you're not adverse to cbind-ing a1 and a2, you can use this:
>
> a1a2 <- cbind(a1, a2)
>
> a3 <- t(apply(a1a2, 1, function(x) x[order(x[1:ncol(a1)])+ncol(a1)]))
>
> Eric
>
>
> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED]
> On Behalf Of Timothy W. Hilton
> Sent: Thursday, July 31, 2008 2:19 PM
> To: r-help@r-project.org
> Subject: [R] sort rows of matrix by rows of another matrix
>
> Hello all,
>
> I am trying to sort rows of one matrix by rows of another.  Given a1 
> and
> a2:
>
> --
>> a1
>  [,1] [,2] [,3]
> [1,]768
> [2,]424
> [3,]472
> [4,]038
>
> a1 <-
> structure(c(7, 4, 4, 0, 6, 2, 7, 3, 8, 4, 2, 8), .Dim = c(4L, 3L))
>
>> a2
>  [,1] [,2] [,3]
> [1,]  101  102  103
> [2,]  101  102  103
> [3,]  101  102  103
> [4,]  101  102  103
>
> a2 <-
> structure(c(101L, 101L, 101L, 101L, 102L, 102L, 102L, 102L, 103L, 
> 103L, 103L, 103L), .Dim = c(4L, 3L))
> --
>
> I want to get a3:
>
>> a3
>  [,1] [,2] [,3]
> [1,]  102  101  103
> [2,]  102  101  103
> [3,]  103  101  102
> [4,]  101  102  103
>
> where the rows of a3 are the rows of a2 sorted according to the rows 
> of
> a1.
>
> I can get the necessary sorting index:
>> apply(a1, 1, order)
>  [,1] [,2] [,3] [,4]
> [1,]2231
> [2,]1112
> [3,]3323
>
> and I can get the rows of a1 sorted according to the rows of a1:
>> t(apply(a1, 1, function(x) x[order(x)]))
>  [,1] [,2] [,3]
> [1,]6

Re: [R] 64-bit R on Mac OS X 10.4.5

2008-07-25 Thread Steven McKinney

Hello

I haven't found better instructions, it's just
not an easy thing to do.

You might also consider joining the r-sig-mac group
and reviewing threads there for additional information.

Rather than try to configure, make and install with
one giant command, I'd suggest breaking the task down
until you have worked out all the details.  Then you
can build more routinely with giant commands such as
the one you report below.

Note that the page of the URL you posted says this:
"R on Mac OS X 10.5 (Leopard)"

so look around the r.research.att.com
website for older 10.4 related instructions.
OS X 10.5 is different enough from 10.4 that
copying and pasting the 10.5 instructions
will not work everywhere.

You don't say what kind of computer you have
(Intel or PowerPC), if you have a power pc
then -arch x86_64 is not right.  So do
report your hardware configuration as well
as your operating system configuration.

Try breaking up the build process into steps
and review the messages generated at each step.

e.g. the configure step:  Just do these bits

cd rd64
../R-devel/configure SHELL='/bin/bash' \
r_arch=x86_64 \
CC="gcc -arch x86_64 -std=gnu99" \
CXX="g++ -arch x86_64" \
OBJC="gcc -arch x86_64" \
F77="gfortran -arch x86_64" \
FC="gfortran -arch x86_64" \
--with-system-zlib \
--with-blas='-framework vecLib' \
--with-lapack 1> configure.R.txt 2>&1

The end bits of this version of the
configure command redirect output and
error messages to a file that you can then
read, to see which bits you are missing and
which bits cause problems.  It's tough to catch those
as they scroll by in a terminal window.

If configure runs without any signs of trouble,
try the make.

make 1> make.R.txt 2>&1

Then you can review all the make output for signs
of problems and error messages.


Here's a configure command that worked for me to
build a 64-bit R on 10.4 a while back:


./configure --host=powerpc64-apple-darwin8.10.0 
--build=powerpc64-apple-darwin8.10.0 \
--prefix=/usr/local/lib64 'CC=gcc-4.0 -arch ppc64' 'CXX=g++ -arch ppc64' \
'FC=gfortran-4.0 -arch ppc64' 'F77=gfortran-4.0 -arch ppc64' \
'CFLAGS=-g -O3 -mtune=G5 -mcpu=G5' 'FFLAGS=-g -O3 -mtune=G5 -mcpu=G5' \
'LDFLAGS=-arch ppc64 -m64 -L/usr/local/lib' 'CXXFLAGS=-g -O3 -mtune=G5 
-mcpu=G5' \
'FCFLAGS=-g -O3 -mtune=G5 -mcpu=G5' --disable-R-framework --enable-R-shlib \
'--with-blas=-framework vecLib' --with-lapack --without-iconv 1> 
configure.R.txt 2>&1


and also note that you need to have root privileges when doing the
make install, so you either need to run the command as root
(not so common on Mac OS X) or run

sudo make install

and enter your password etc.

HTH

Steve McKinney




-Original Message-
From: [EMAIL PROTECTED] on behalf of joseph
Sent: Fri 7/25/2008 5:07 PM
To: r-help@r-project.org
Cc: r-help@r-project.org
Subject: [R] 64-bit R on Mac OS X 10.4.5
 
Hello
I have a Mac OS X 10.4.5.  I am trying to build a 64-bit R by following the 
directions on this page: http://r.research.att.com/building.html

r_arch=x86_64 \
CC="gcc -arch x86_64 -std=gnu99" \
CXX="g++ -arch x86_64" \
OBJC="gcc -arch x86_64" \
F77="gfortran -arch x86_64" \
FC="gfortran -arch x86_64"

PATH=/usr/X11/bin:/usr/local/bin:$PATH
export PATH
LANG=C
export LANG
svn co https://svn.r-project.org/R/trunk R-devel
# you may have to accept a certificate here
cd R-devel
tools/rsync-recommended
cd ..
# got the sources, on to building 64-bit R
mkdir rd64
cd rd64
../R-devel/configure SHELL='/bin/bash' \
r_arch=x86_64 \
CC="gcc -arch x86_64 -std=gnu99" \
CXX="g++ -arch x86_64" \
OBJC="gcc -arch x86_64" \
F77="gfortran -arch x86_64" \
FC="gfortran -arch x86_64" \
--with-system-zlib \
--with-blas='-framework vecLib' --with-lapack && \
make -j4 && \
make check && \
make install
cd ..when I try to run it by typing R, it gives me the following error: -bash: 
R: command not found


Can any body help me to solve this problem or direct me to better step-by-step 
instructions.
Thanks
Joseph


  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Calling LISP programs in R

2008-07-23 Thread Steven McKinney

Have you reviewed the old
xlisp machinery that used to
be in R?  Check out the RXLisp
library at
http://www.omegahat.org/RXLisp/
and see if this will work.  The
PDF 
http://www.omegahat.org/RXLisp/examples.pdf
reviews calling xlisp from R.

HTH

Steve McKinney


-Original Message-
From: [EMAIL PROTECTED] on behalf of francogrex
Sent: Wed 7/23/2008 7:06 AM
To: r-help@r-project.org
Subject: [R]  Calling LISP programs in R
 

I have written some programs in Common Lisp and I have been using SAS to pipe
those programs to my lisp compiler in batch mode by using the %xlog and
%xlst SAS commands. I wonder if there is in R a similar way to pipe commands
to LISP so that all my work would be concentrated in R even when I have to
call a LISP program? I have looked at the foreign library but this seems to
adjust data types not for piping commands in batch mode. 

Here is a simple program (example) to generate random normal variables from
SAS to LISP; of course it's a toy example because there is no need for a
LISP routine in this particular case. I hope R has a similar feature. Thanks
DATA _NULL_;
FILE 'c:\cl.lisp' LRECL=1024;
PUT "(defun run (num Mn SD)";
PUT   "(setq mix  (list nil))";
PUT   "(dotimes (n num)";
PUT"(setq u (- (* 2 (random 1.0)) 1)"; 
PUT"v (- (* 2 (random 1.0)) 1)"; 
PUT"w (+ (expt u 2) (expt v 2)))";
PUT"(when (< w 1)";
PUT  "(progn";
PUT"(setq z (sqrt (/ (* -2 (log w)) w))";
PUT"x (* u z)";
PUT"y (* v z)";
PUT "mix (append (list x)  mix)";
PUT "mix (append (list y)  mix)";
PUT   "(setq mix (remove nil mix)";
PUT   "mixnew (loop for x in mix append (list(+ Mn (* SD x)";
PUT   "(print mixnew)";
PUT   "(flet(( mean(zlist)";
PUT "(setq sum (loop for x in zlist sum x))";
PUT "(setq m (/ sum (length zlist)";
PUT   '(pprint (list "The mean is:" (mean mixnew';
PUT   "(flet((var(zlist)";
PUT "(setq sumsq (loop for x in zlist sum (* (- x m) (- x m)))";
PUT "v (/ sumsq (- (length zlist) 1)";
PUT   "(setq std (sqrt (var mixnew";
PUT   '(pprint (list "The Standard Deviation is:" std)))';
PUT "(run 500 41 17.5)";
RUN;
%xlog(CL);

-- 
View this message in context: 
http://www.nabble.com/Calling-LISP-programs-in-R-tp18611512p18611512.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


  1   2   >