Re: [R] [R-SIG-Mac] morley object?
Hi Ann Looks like a typo - I see "moreley.tab" that should be "morley.tab". Works for me after correcting that. > filepath <- system.file("data", "moreley.tab", package="datasets") > filepath [1] "" > > filepath <- system.file("data", "morley.tab", package="datasets") > filepath [1] "C:/PROGRA~1/R/R-30~1.1/library/datasets/data/morley.tab" > mm <- read.table(filepath) > head(mm) Expt Run Speed 001 1 1 850 0021 2 740 0031 3 900 0041 4 1070 0051 5 930 0061 6 850 > Steven McKinney Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre > -Original Message- > From: r-sig-mac-boun...@r-project.org [mailto:r-sig-mac-bounces@r- > project.org] On Behalf Of Ann Summy > Sent: November-22-13 4:56 PM > To: r-sig-mac; r-help > Subject: [R-SIG-Mac] morley object? > > Brand new to R and not a sys admin or programmer type. I am trying to work > through the intro tutorial in the sample session here > http://cran.r-project.org/doc/manuals/R-intro.html#A-sample-session. > > I have an error when following the instructions below: > > > The next section will look at data from the classical experiment of > > Michelson to measure the speed of light. This dataset is available in the > > morley object, but we will read it to illustrate the read.table function. > > filepath <- system.file("data", "morley.tab" , > package="datasets")filepath > > > > Get the path to the data file. > > file.show(filepath) > > > > Optional. Look at the file. > > mm <- read.table(filepath)mm > > > > Read in the Michelson data as a data frame, and look at it. There are > five > > experiments (column Expt) and each has 20 runs (column Run) and sl is the > > recorded speed of light, suitably coded. > > > Here is my result: > > > > filepath <- system.file("data", "moreley.tab", package="datasets") > > > filepath > > [1] "" > > > file.show(filepath) > > > mm <- read.table(filepath) > > Error in read.table(filepath) : no lines available in input > > In addition: Warning message: > > In file(file, "rt") : > > file("") only supports open = "w+" and open = "w+b": using the former > > > Is it because I am missing this "morley object"? How do I see a list of > what objects I have installed? I installed > R-2.15.3.pkg<http://cran.r-project.org/bin/macosx/old/R-2.15.3.pkg>, > because > I am teaching myself on a 32-bit Macbook. > > [[alternative HTML version deleted]] > > ___ > R-SIG-Mac mailing list > r-sig-...@r-project.org > https://stat.ethz.ch/mailman/listinfo/r-sig-mac __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Change the text size of the title in a legend of a R plot.
> -Original Message- > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On > Behalf Of Victor Gabillon > Sent: April-28-11 8:22 PM > To: r-help@r-project.org > Subject: [R] Change the text size of the title in a legend of a R plot. > > Hello, > > Is it possible to change the text size of the title in a legend of a R plot? > > I tried to directly change the title.cex argument but it seems not to work. > > Trying : > > Horizo <- c(1,2,6,10,20) > legtext <- paste(Horizo,sep="") > legend("topleft", legend=legtext,col=col,text.col=col,lwd=lwd, > lty=lty,cex=1.1,ncol=3,title = "Horizons",title.col ="black",title.cex=1.4) I haven't found any cex argument that works for just the legend title, but you can get some modification of the title with the expression argument: legend("topleft", legend=legtext,col=col,text.col=col,lwd=lwd, lty=lty,cex=1.1,ncol=3, title = expression(bold("Horizons")),title.col="black") Does that help? Otherwise, you can of course figure out which functions do the legend plotting, copy and modify those to get a title cex in place. Steve McKinney > > gives the following error (sorry in french): > Erreur dans legend("topleft", legend = legtext, col = col, text.col = > col, : >argument(s) inutilisé(s) (title.cex = 1.4) > > saying title.cex argument as been ignored. > > Thank you for helping. > > Victor > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] is this an ANOVA ?
> -Original Message- > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On > Behalf Of Ubuntu Diego > Sent: April-12-11 5:10 PM > To: r-help@r-project.org > Subject: [R] is this an ANOVA ? > > Hi all, > I have a very easy questions (I hope). I had measure a property of > plants, growing in three > different substrates (A, B and C). The rest of the conditions remained > constant. There was very high > variation on the results. > I want to do address, whether there is any difference in the response > (my measurement) from > substrate to substrate? > > x<-c('A','A','A','A','A','B','B','B','B','B','C','C','C','C','C') # Substrate > type > y <- c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15) # Results of the measurement > MD<-data.frame(x,y) > > I wrote a linear model for this: > > summary(lm(y~x,data=MD)) > > This is the output: > > Call: > lm(formula = y ~ x, data = MD) > > Residuals: >Min 1Q Median 3QMax > -2.000e+00 -1.000e+00 5.551e-17 1.000e+00 2.000e+00 > > Coefficients: > Estimate Std. Error t value Pr(>|t|) > (Intercept) 3. 0.7071 4.243 0.001142 ** > xB5. 1. 5.000 0.000309 *** > xC 10. 1. 10.000 3.58e-07 *** > --- > Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 > > Residual standard error: 1.581 on 12 degrees of freedom > Multiple R-squared: 0.8929, Adjusted R-squared: 0.875 > F-statistic:50 on 2 and 12 DF, p-value: 1.513e-06 > > I conclude that there is an effect of substrate type (x). > NOW the questions : > 1) Do the fact that the all p-values are significant means that > all the groups are > different from each other ? No, the small p-values indicate that the associated estimate appears to be significantly different from zero. You can use the package "multcomp" to do multiple comparisons. > require("multcomp") > lma <- aov(y ~ x, data = MD) > lmamc <- glht(lma, linfct = mcp(x = "Tukey")) > ci.lma <- confint(lmamc) > ci.lma Simultaneous Confidence Intervals Multiple Comparisons of Means: Tukey Contrasts Fit: aov(formula = y ~ x, data = MD) Quantile = 2.667 95% family-wise confidence level Linear Hypotheses: Estimate lwr upr B - A == 0 5. 2.3330 7.6670 C - A == 0 10. 7.3330 12.6670 C - B == 0 5. 2.3330 7.6670 > lmacld <- cld(lmamc) > plot(lmacld) > plot(ci.lma) > 2) Is there a (easy) way to plot, mean plus/minus 2*sd for > each substrate type ? (with > asterisks denoting significant differences ?) Not that I know of, though the multcomp tables and plots yield logically equivalent results and plots. Writing a few lines of code to accomplish your graph is fairly straightforward. HTH Steven McKinney > > > THANKS ! > > version > platform x86_64-apple-darwin9.8.0 > arch x86_64 > os darwin9.8.0 > system x86_64, darwin9.8.0 > status > major 2 > minor 11.1 > year 2010 > month 05 > day31 > svn rev52157 > language R > version.string R version 2.11.1 (2010-05-31) > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Fast version of Fisher's Exact Test
Depends on how many other programs, and how large they are, and how much RAM you have on your machine. If I repeatedly run the example I used below, my R session shows 170MB of memory usage, not a huge amount relative to total memory, and not a huge amount even for 32 bit R. But if your system has 2 GB of RAM and 1.9 GB is consumed by other processes, then this example will cause swapping and speed will be reduced. So figuring out a solution requires understanding what it is that is causing the slowdown - not enough RAM, other programs competing for CPU cycles... You can try switching to 64 bit R but unless your 32 bit R is loading some large data objects, leaving little RAM, you won't see much of a difference. If you start R, and do rm(list = ls()) to ensure no big data objects are using up RAM, does the example below still take a long time? You haven't mentioned what operating system you are using, how much RAM you have or what sessionInfo() reports on your machine. That information will help to figure this out. Steven McKinney From: Jim Silverton [jim.silver...@gmail.com] Sent: April 9, 2011 9:21 AM To: Steven McKinney Subject: Re: [R] Fast version of Fisher's Exact Test I R 32 bit installed but my machine is 64 bit. Do I need to upgrade the R to 64 bit for it to run faster? On Fri, Apr 8, 2011 at 6:44 PM, Steven McKinney mailto:smckin...@bccrc.ca>> wrote: Do you mean a test something such as this? > fisher.test(matrix(c(502,498,490, 510), nrow = 2)) Fisher's Exact Test for Count Data data: matrix(c(502, 498, 490, 510), nrow = 2) p-value = 0.6228 alternative hypothesis: true odds ratio is not equal to 1 95 percent confidence interval: 0.8770113 1.2550998 sample estimates: odds ratio 1.049119 This runs quickly on my machine. > system.time(fisher.test(matrix(c(502,498,490, 510), nrow = 2))) user system elapsed 0.008 0.001 0.010 > sessionInfo() R version 2.12.2 (2011-02-25) Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit) locale: [1] en_CA.UTF-8/en_CA.UTF-8/C/C/en_CA.UTF-8/en_CA.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base loaded via a namespace (and not attached): [1] tools_2.12.2 > Can you provide an example that is running slowly for you? Steven McKinney From: r-help-boun...@r-project.org<mailto:r-help-boun...@r-project.org> [r-help-boun...@r-project.org<mailto:r-help-boun...@r-project.org>] On Behalf Of Jim Silverton [jim.silver...@gmail.com<mailto:jim.silver...@gmail.com>] Sent: April 8, 2011 9:43 AM To: r-help@r-project.org<mailto:r-help@r-project.org> Subject: Re: [R] Fast version of Fisher's Exact Test Is anyone aware of a fast way of doing fisher's exact test for a series of 2 x 2 tables in R? The fisher.test is really slow if n1=1000 and n2 = 1000. -- Thanks, Jim. [[alternative HTML version deleted]] __ R-help@r-project.org<mailto:R-help@r-project.org> mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Thanks, Jim. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Fast version of Fisher's Exact Test
Do you mean a test something such as this? > fisher.test(matrix(c(502,498,490, 510), nrow = 2)) Fisher's Exact Test for Count Data data: matrix(c(502, 498, 490, 510), nrow = 2) p-value = 0.6228 alternative hypothesis: true odds ratio is not equal to 1 95 percent confidence interval: 0.8770113 1.2550998 sample estimates: odds ratio 1.049119 This runs quickly on my machine. > system.time(fisher.test(matrix(c(502,498,490, 510), nrow = 2))) user system elapsed 0.008 0.001 0.010 > sessionInfo() R version 2.12.2 (2011-02-25) Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit) locale: [1] en_CA.UTF-8/en_CA.UTF-8/C/C/en_CA.UTF-8/en_CA.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base loaded via a namespace (and not attached): [1] tools_2.12.2 > Can you provide an example that is running slowly for you? Steven McKinney From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of Jim Silverton [jim.silver...@gmail.com] Sent: April 8, 2011 9:43 AM To: r-help@r-project.org Subject: Re: [R] Fast version of Fisher's Exact Test Is anyone aware of a fast way of doing fisher's exact test for a series of 2 x 2 tables in R? The fisher.test is really slow if n1=1000 and n2 = 1000. -- Thanks, Jim. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Where is the tcltk package?
D'oh! Sorry for that noise. You also need bwidget and Tktable installed on your linux system, so you may need to reinstall R after getting bwidget and other tcl/tk bits installed. Are you using a tool such as "yum" to install linux software, or installing yourself from source or binaries? Steven McKinney From: Rolf Turner [r.tur...@auckland.ac.nz] Sent: April 7, 2011 5:38 PM To: Steven McKinney Cc: r-help@r-project.org Subject: Re: [R] Where is the tcltk package? On 08/04/11 12:31, Steven McKinney wrote: > Searching for tcltk on the BioConductor site yields > > CRAN - Package tcltk2 - http://cran.r-project.org/web/packages/tcltk2/ > CRAN - Package tcltk2 tcltk2: Tcl/Tk Additions A series of additional > Tcl commands and Tk widgets with style and various functions (under Windows: > DDE exchange, access to the registry > > If you install tcltk2, is tcltk then available? > I ***can't*** install tcltk2, as I said in my original post. It requires that tcltk be available. cheers, Rolf Turner __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Where is the tcltk package?
Searching for tcltk on the BioConductor site yields CRAN - Package tcltk2 - http://cran.r-project.org/web/packages/tcltk2/ CRAN - Package tcltk2 tcltk2: Tcl/Tk Additions A series of additional Tcl commands and Tk widgets with style and various functions (under Windows: DDE exchange, access to the registry If you install tcltk2, is tcltk then available? Steven McKinney From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of Rolf Turner [r.tur...@auckland.ac.nz] Sent: April 7, 2011 5:15 PM To: r-help@r-project.org Subject: [R] Where is the tcltk package? Perhaps I'm being even thicker than usual, but I can't find the tcltk package on CRAN. There is a tcltk2 package, which says that it is a collection of supplements to tcltk, but I cannot see a just-plain tcltk anywhere. If I try to install tcltk2 (from the Linux command line, or using install.packages() in R) it complains that it needs tcltk. If I try to install tcltk using install.packages() it tells me that this package is not available. What is going on? cheers, Rolf Turner __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ANCOVA for linear regressions without intercept
Hi Yusuke, Does the following get what you are after? ### Make some test data. > set.seed(123) > edf <- data.frame(sex = c(rep("Male", 10), rep("Female", 10), rep("Unknown", > 10)), + head_length = c(1.2 * c(170:179 + rnorm(10)), 0.8 * c(150:159 + rnorm(10)), c(160:169 + rnorm(10)))/10, + body_length = c(c(170:179 + rnorm(10)), c(150:159 + rnorm(10)), c(160:169 + rnorm(10))) + ) > edf$sex <- factor(as.character(edf$sex)) > plot(edf$head_length, edf$body_length, pch = as.numeric(edf$sex), col = > as.numeric(edf$sex), xlim = c(0, 25), ylim = c(0, 190)) > lmf <- lm(body_length ~ head_length * sex, data = edf) ### The full model - do keep an eye on those intercepts and try to ensure they are not far from 0. > summary(lmf) Call: lm(formula = body_length ~ head_length * sex, data = edf) Residuals: Min 1Q Median 3Q Max -2.73783 -0.68133 0.02147 0.50858 2.38931 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) -3.578 25.425 -0.141 0.8893 head_length 12.772 2.054 6.2182e-06 *** sexMale 15.122 37.464 0.404 0.6901 sexUnknown 40.308 33.137 1.216 0.2357 head_length:sexMale -4.977 2.438 -2.042 0.0523 . head_length:sexUnknown -4.971 2.428 -2.047 0.0517 . --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: 1.384 on 24 degrees of freedom Multiple R-squared: 0.9802, Adjusted R-squared: 0.9761 F-statistic: 237.7 on 5 and 24 DF, p-value: < 2.2e-16 ### Now suppress intercepts. head_length:sex should give interactions (slopes) only. > lmrf <- lm(body_length ~ -1 + head_length : sex, data = edf) > summary(lmrf) Call: lm(formula = body_length ~ -1 + head_length:sex, data = edf) Residuals: Min 1Q Median 3Q Max -3.02782 -0.61861 -0.01079 0.68785 2.57544 Coefficients: Estimate Std. Error t value Pr(>|t|) head_length:sexFemale 12.482530.03549 351.8 <2e-16 *** head_length:sexMale 8.345000.02097 398.0 <2e-16 *** head_length:sexUnknown 10.038440.02677 375.0 <2e-16 *** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: 1.389 on 27 degrees of freedom Multiple R-squared: 0., Adjusted R-squared: 0. F-statistic: 1.409e+05 on 3 and 27 DF, p-value: < 2.2e-16 ### Check the numeric coding of the factor > with(edf, table(sex, as.numeric(sex))) sex1 2 3 Female 10 0 0 Male 0 10 0 Unknown 0 0 10 > abline(a = 0, b = coef(lmrf)[1], col = 1) ## Females = Black > abline(a = 0, b = coef(lmrf)[2], col = 2) ## Males = Red > abline(a = 0, b = coef(lmrf)[3], col = 3) ## Unknown = Green ### If no diff between males and females, then males and females can be combined into one group. > edf$MvF <- as.character(edf$sex) > edf$MvF[edf$MvF != "Unknown"] <- "MorF" > edf$MvF <- factor(edf$MvF) > with(edf, table(MvF, sex)) sex MvF Female Male Unknown MorF10 10 0 Unknown 00 10 > lmr1f <- lm(body_length ~ -1 + head_length : MvF, data = edf) > summary(lmr1f) Call: lm(formula = body_length ~ -1 + head_length:MvF, data = edf) Residuals: Min 1Q Median 3Q Max -23.976 -21.656 0.077 35.899 39.839 Coefficients: Estimate Std. Error t value Pr(>|t|) head_length:MvFMorF 9.4156 0.3429 27.46 <2e-16 *** head_length:MvFUnknown 10.0384 0.5085 19.74 <2e-16 *** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: 26.39 on 28 degrees of freedom Multiple R-squared: 0.9761, Adjusted R-squared: 0.9744 F-statistic: 571.9 on 2 and 28 DF, p-value: < 2.2e-16 ### Test the hypothesis that male and female heights are equivalent > anova(lmr1f, lmrf) Analysis of Variance Table Model 1: body_length ~ -1 + head_length:MvF Model 2: body_length ~ -1 + head_length:sex Res.Df RSS Df Sum of Sq FPr(>F) 1 28 19496.1 2 2752.1 1 19444 10077 < 2.2e-16 *** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ### Plot the reduced model regression lines > abline(a = 0, b = coef(lmr1f)[1], col = "blue", lty = 2) > abline(a = 0, b = coef(lmr1f)[2], col = "orange", lty = 2, lwd = 4) > The other two tests can be set up and run similarly. Don't forget to adjust fo
Re: [R] Linear Model with curve fitting parameter?
> -Original Message- > From: stephen sefick [mailto:ssef...@gmail.com] > Sent: April-04-11 2:49 PM > To: Steven McKinney > Subject: Re: [R] Linear Model with curve fitting parameter? > > Steven: > > I am really sorry for my confusion. I hope this now makes sense. > > b0 == y intercept == y-intercept == (intercept) fit by lm > > a <- 1:10 > b <- 1:10 > > summary(lm(a~b)) > #to show what I was calling b0 > > So... > > > manning > > Q = K*A*(R^b2)*(S^b3) > > log(Q) = log(K)+log(A)+(b2*log(R))+(b3*log(S)) Okay, using this notation, this appears to be the original model you queried about. So for this model, as I showed before, Let Z = log(Q) - log(A) E(Z) = b0 + b2*log(R) + b3*log(S) = log(K) + b2*log(R) + b3*log(S) Fitting the model lm(Z ~ log(R) + log(S)) will yield parameter estimates b_hat_0, b_hat_2, b_hat_3 where b_hat_0 (the fitted model intercept) is an estimate of b0 (which is log(K)), b_hat_2 is an estimate of b2, b_hat_3 is an estimate of b3. So in answer to your previous question, b0 is an estimate of log(K), not ( log(Qintercept)+log(K) ) so an estimate for K is exp(b_hat_0) > > > dingman > Q = K*(A^b1)*(R^b2)*(S^b3*log(S)) > > log(Q) = log(K)+(b1*log(A))+(b2*log(R))+(b3*(log(S))^2) The dingman model notation is ambiguous. Is the last term S^(b3*log(S)) or (S^b3)*log(S) ? Previous email showed > dingman > log(Q)=log(b0)+log(K)+a*log(A)+r*log(R)+s*(log(S))^2 which implies (if I ignore the log(b0) term) Q = K*(A^a)*(R^r)*(exp(log(S)*log(S))^s) = K*(A^a)*(R^r)*(S^(log(S)*s)) This is linearizable as log(Q) = log(K) + a*log(A) + r*log(R) + s*(log(S))^2 = b0 + b1*log(A) + b2*log(R) + b3*(log(S)^2) Fitting lm(log(Q) ~ log(A) + log(R) + I(log(S)^2) ... ) will yield estimates b_hat_0, b_hat_1, b_hat_2 and b_hat_3 where b_hat_0 is an estimate of b0 = log(K) so an estimate of K is exp(b_hat_0), b_hat_1 is an estimate of b1 = a, b_hat_2 is an estimate of b2 = r, b_hat_3 is an estimate of b3 = s > > > > Bjerklie > > Q = K*(A^b1)*(R^b2)*(S^b3) > > log(Q) = log(K)+(b1*log(A))+(b2*log(R))*(b3*log(S)) Fitting lm(log(Q) ~ log(A) + log(R) + log(S) ... ) will yield estimates b_hat_0, b_hat_1, b_hat_2 and b_hat_3 where b_hat_0 is an estimate of b0 = log(K) so an estimate of K is exp(b_hat_0), b_hat_1 is an estimate of b1 = a, b_hat_2 is an estimate of b2 = r, b_hat_3 is an estimate of b3 = s Best Steve McKinney > > > > > > > > On Mon, Apr 4, 2011 at 2:58 PM, Steven McKinney wrote: > > > >> -Original Message- > >> From: stephen sefick [mailto:ssef...@gmail.com] > >> Sent: April-03-11 5:35 PM > >> To: Steven McKinney > >> Cc: R help > >> Subject: Re: [R] Linear Model with curve fitting parameter? > >> > >> Steven: > >> > >> You are exactly right sorry I was confused. > >> > >> > >> ### > >> so log(y-intercept)+log(K) is a constant called b0 (is this right?) > > > > Doesn't look right to me based on the information you've provided. > > I don't see anything labeled "y" in your previous emails, so I'm > > not clear on what y is and how it relates to the original model > > you described > > > > > >> I have a model Q=K*A*(R^r)*(S^s) > > > >> > > > >> A, R, and S are data I have and K is a curve fitting parameter. > > > > If the model is > > > > Q=K*A*(R^r)*(S^s) > > > > then > > > > log(Q) = log(K) + log(A) + r*log(R) + s*log(S) > > > > Rearranging yields > > > > log(Q) - log(A) = log(K) + r*log(R) + s*log(S) > > > > Let Z = log(Q) - log(A) = log(Q/A) > > > > so > > > > Z = log(K) + r*log(R) + s*log(S) > > > > and a linear model fit of > > > > Z ~ log(R) + log(S) > > > > will yield parameter estimates for the linear equation > > > > E(Z) = B0 + B1*log(R) + B2*log(S) > > > > (E(Z) = expected value of Z) > > > > so B0 estimate is an estimate of log(K) > > B1 estimate is an estimate of r > > B2 estimate is an estimate of s > > > > and these are the only parameters you described in the original model. > > > > > >> > >> lm(log(Q)~log(A)+log(R)+log(S)-1) > >> > >> is fitting the model > >> > >> log(Q)=a*log(A)+r*log(R)+s*log(S) (n
Re: [R] Linear Model with curve fitting parameter?
> -Original Message- > From: stephen sefick [mailto:ssef...@gmail.com] > Sent: April-03-11 5:35 PM > To: Steven McKinney > Cc: R help > Subject: Re: [R] Linear Model with curve fitting parameter? > > Steven: > > You are exactly right sorry I was confused. > > > ### > so log(y-intercept)+log(K) is a constant called b0 (is this right?) Doesn't look right to me based on the information you've provided. I don't see anything labeled "y" in your previous emails, so I'm not clear on what y is and how it relates to the original model you described > >> I have a model Q=K*A*(R^r)*(S^s) > >> > >> A, R, and S are data I have and K is a curve fitting parameter. If the model is Q=K*A*(R^r)*(S^s) then log(Q) = log(K) + log(A) + r*log(R) + s*log(S) Rearranging yields log(Q) - log(A) = log(K) + r*log(R) + s*log(S) Let Z = log(Q) - log(A) = log(Q/A) so Z = log(K) + r*log(R) + s*log(S) and a linear model fit of Z ~ log(R) + log(S) will yield parameter estimates for the linear equation E(Z) = B0 + B1*log(R) + B2*log(S) (E(Z) = expected value of Z) so B0 estimate is an estimate of log(K) B1 estimate is an estimate of r B2 estimate is an estimate of s and these are the only parameters you described in the original model. > > lm(log(Q)~log(A)+log(R)+log(S)-1) > > is fitting the model > > log(Q)=a*log(A)+r*log(R)+s*log(S) (no beta 0) > > and > > lm(log(Q)~log(A)+log(R)+log(S)) > > > is fitting the model > > log(Q)=b0+a*log(A)+r*log(R)+s*log(S) K has disappeared from these equations so these model fits do not correspond to the model originally described. Now a b0 appears, and is used in models below. I think changing notation is also adding confusion. What are "y" and "intercept" you discuss above, in relation to your original notation? > > ## > > These are the models I am trying to fit and if I have reasoned > correctly above then I should be able to fit the below models > similarly. You will be able to fit models appropriately once you have a clearly defined system of notation that allows you to map between the proposed data model, the parameters in that model, and the corresponding regression equations. Once you have consistent notation, you will be able to see if you can express your model as a linear regression, or if not, what kind of non-linear regression you will need to do to get estimates for the parameters in your model. Best Steve McKinney > > manning > log(Q)=log(b0)+log(K)+log(A)+r*log(R)+s*log(S) > > dingman > log(Q)=log(b0)+log(K)+a*log(A)+r*log(R)+s*(log(S))^2 > > bjerklie > log(Q)=log(b0)+log(K)+a*log(A)+r*log(R)+s*log(S) > > ### > > Thank you for all of your help! > > Stephen > > On Fri, Apr 1, 2011 at 2:44 PM, Steven McKinney wrote: > > > >> -Original Message- > >> From: stephen sefick [mailto:ssef...@gmail.com] > >> Sent: April-01-11 5:44 AM > >> To: Steven McKinney > >> Cc: R help > >> Subject: Re: [R] Linear Model with curve fitting parameter? > >> > >> Setting Z=Q-A would be the incorrect dimensions. I could Z=Q/A. > > > > I suspect this is confusion about what Q is. I was presuming that > > the Q in this following formula was log(Q) with Q from the original data. > > > >> >> I have taken the log of the data that I have and this is the model > >> >> formula without the K part > >> >> > >> >> lm(Q~offset(A)+R+S, data=x) > > > > If the model is > > > > Q=K*A*(R^r)*(S^s) > > > > then > > > > log(Q) = log(K) + log(A) + r*log(R) + s*log(S) > > > > Rearranging yields > > > > log(Q) - log(A) = log(K) + r*log(R) + s*log(S) > > > > so what I labeled 'Z' below is > > > > Z = log(Q) - log(A) = log(Q/A) > > > > so > > > > Z = log(K) + r*log(R) + s*log(S) > > > > and a linear model fit of > > > > Z ~ log(R) + log(S) > > > > will yield parameter estimates for the linear equation > > > > E(Z) = B0 + B1*log(R) + B2*log(S) > > > > (E(Z) = expected value of Z) > > > > so B0 estimate is an estimate of log(K) > > B1 estimate is an estimate of r > > B2 estimate is an estimate of s > > > > More details and careful notation will eventually lead > > to a reasonable description and analysis strategy. > > > > > &g
Re: [R] Linear Model with curve fitting parameter?
> -Original Message- > From: stephen sefick [mailto:ssef...@gmail.com] > Sent: April-01-11 5:44 AM > To: Steven McKinney > Cc: R help > Subject: Re: [R] Linear Model with curve fitting parameter? > > Setting Z=Q-A would be the incorrect dimensions. I could Z=Q/A. I suspect this is confusion about what Q is. I was presuming that the Q in this following formula was log(Q) with Q from the original data. > >> I have taken the log of the data that I have and this is the model > >> formula without the K part > >> > >> lm(Q~offset(A)+R+S, data=x) If the model is Q=K*A*(R^r)*(S^s) then log(Q) = log(K) + log(A) + r*log(R) + s*log(S) Rearranging yields log(Q) - log(A) = log(K) + r*log(R) + s*log(S) so what I labeled 'Z' below is Z = log(Q) - log(A) = log(Q/A) so Z = log(K) + r*log(R) + s*log(S) and a linear model fit of Z ~ log(R) + log(S) will yield parameter estimates for the linear equation E(Z) = B0 + B1*log(R) + B2*log(S) (E(Z) = expected value of Z) so B0 estimate is an estimate of log(K) B1 estimate is an estimate of r B2 estimate is an estimate of s More details and careful notation will eventually lead to a reasonable description and analysis strategy. Best Steve McKinney > Is fitting a nls model the same as fitting an ols? These data are > hydraulic data from ~47 sites. To access predictive ability I am > removing one site fitting a new model and then accessing the fit with > a myriad of model assessment criteria. I should get the same answer > with ols vs nls? Thank you for all of your help. > > Stephen > > On Thu, Mar 31, 2011 at 8:34 PM, Steven McKinney wrote: > > > >> -Original Message- > >> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] > >> On Behalf Of stephen > sefick > >> Sent: March-31-11 3:38 PM > >> To: R help > >> Subject: [R] Linear Model with curve fitting parameter? > >> > >> I have a model Q=K*A*(R^r)*(S^s) > >> > >> A, R, and S are data I have and K is a curve fitting parameter. I > >> have linearized as > >> > >> log(Q)=log(K)+log(A)+r*log(R)+s*log(S) > >> > >> I have taken the log of the data that I have and this is the model > >> formula without the K part > >> > >> lm(Q~offset(A)+R+S, data=x) > >> > >> What is the formula that I should use? > > > > Let Z = Q - A for your logged data. > > > > Fitting lm(Z ~ R + S, data = x) should yield > > intercept parameter estimate = estimate for log(K) > > R coefficient parameter estimate = estimate for r > > S coefficient parameter estimate = estimate for s > > > > > > > > Steven McKinney > > > > Statistician > > Molecular Oncology and Breast Cancer Program > > British Columbia Cancer Research Centre > > > > > > > >> > >> Thanks for all of your help. I can provide a subset of data if necessary. > >> > >> > >> > >> -- > >> Stephen Sefick > >> > >> | Auburn University | > >> | Biological Sciences | > >> | 331 Funchess Hall | > >> | Auburn, Alabama | > >> | 36849 | > >> |___| > >> | sas0...@auburn.edu | > >> | http://www.auburn.edu/~sas0025 | > >> |___| > >> > >> Let's not spend our time and resources thinking about things that are > >> so little or so large that all they really do for us is puff us up and > >> make us feel like gods. We are mammals, and have not exhausted the > >> annoying little problems of being mammals. > >> > >> -K. Mullis > >> > >> "A big computer, a complex algorithm and a long time does not equal > >> science." > >> > >> -Robert Gentleman > >> __ > >> R-help@r-project.org mailing list > >> https://stat.ethz.ch/mailman/listinfo/r-help > >> PLEASE do read the posting guide > >> http://www.R-project.org/posting-guide.html > >> and provide commented, minimal, self-contained, reproducible code. > > > > > > -- > St
Re: [R] Linear Model with curve fitting parameter?
> -Original Message- > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On > Behalf Of stephen sefick > Sent: March-31-11 3:38 PM > To: R help > Subject: [R] Linear Model with curve fitting parameter? > > I have a model Q=K*A*(R^r)*(S^s) > > A, R, and S are data I have and K is a curve fitting parameter. I > have linearized as > > log(Q)=log(K)+log(A)+r*log(R)+s*log(S) > > I have taken the log of the data that I have and this is the model > formula without the K part > > lm(Q~offset(A)+R+S, data=x) > > What is the formula that I should use? Let Z = Q - A for your logged data. Fitting lm(Z ~ R + S, data = x) should yield intercept parameter estimate = estimate for log(K) R coefficient parameter estimate = estimate for r S coefficient parameter estimate = estimate for s Steven McKinney Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre > > Thanks for all of your help. I can provide a subset of data if necessary. > > > > -- > Stephen Sefick > > | Auburn University | > | Biological Sciences | > | 331 Funchess Hall | > | Auburn, Alabama | > | 36849 | > |___| > | sas0...@auburn.edu | > | http://www.auburn.edu/~sas0025 | > |___| > > Let's not spend our time and resources thinking about things that are > so little or so large that all they really do for us is puff us up and > make us feel like gods. We are mammals, and have not exhausted the > annoying little problems of being mammals. > > -K. Mullis > > "A big computer, a complex algorithm and a long time does not equal science." > > -Robert Gentleman > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] xlsx problem
Hi Cristoph, Glad to hear it worked - replying to r-help in case others face this issue. Cheers Steven McKinney From: Martin Knapp [mkna...@aucklanduni.ac.nz] Sent: March 28, 2011 7:28 PM To: Steven McKinney Subject: Re: [R] xlsx problem Thanks that worked. still don't know what was wrong though but I can live with that. Christoph On 29 March 2011 14:16, Steven McKinney wrote: > Dear Christoph > > Are there formulas in any cells of sheet 6? > > One thing you might try is to open up the spreadsheet in Excel and create a > new sheet, > call it Sheet 7. > > In Sheet 6, select all the cells with data (or just click the > upper left icon at the border of the spreadsheet that selects the whole > sheet, -a will also do this), then go to sheet 7 and > do a Paste Special - Values > > If you just paste the values, then save the file and read Sheet 7, does that > work? > > > While you're in the spreadsheet, see if you can spot any odd looking entries > on sheet 6. > Excel lets people put just about anything, just about anywhere. > > Steven McKinney > > From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf > Of Martin Knapp [mkna...@aucklanduni.ac.nz] > Sent: March 28, 2011 1:43 PM > To: r-help > Subject: [R] xlsx problem > > Dear list, > I'm running windows xp with R 2.12.0. I'm trying to load a excel > spreadsheet into R using the xlsx package. I posted my code below with > the error I get. > >> res <- read.xlsx("Copy of test_excel_input_data.xlsx", 6) > Error in .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", cl, : > java.lang.IllegalStateException: Cannot get a numeric value from a text cell > > There are 6 sheets in that file. 5 of them are read into a dataframe > without problem. For the 6th one(the one I actually need) I get this > error message. > There is obviously something in the format of the sheet wrong. Has > anyone an idea whats going wrong and what I have to change in the > sheet? > > Thanks in advance, > > Christoph > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Points on Map
Here's a start... > require("maps") > foo <- map("state", "new york") > lines(x = range(foo$x, na.rm = TRUE), y = range(foo$y, na.rm = TRUE)) You can figure out from this how to specify the coordinates that you want for dividing up the map, put them in a file etc. Steven McKinney From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of Jaimin Dave [davejaim...@gmail.com] Sent: March 28, 2011 11:12 AM To: r-help@r-project.org Subject: [R] Points on Map Hi, I am new to R and I want to plot points on the Map of New York . I also want to divide map of New York into small grids(not fixed) .I want that these point should be plotted from a file.How can I do it?Any help would be greatly appreciated. Thanks Jaimin [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] rep for multiple categories
Hi Kathi, Are you looking for something like this? > Alist <- letters[1:4] > Blist <- 1:5 > Clist <- LETTERS[24:26] > Alist [1] "a" "b" "c" "d" > Blist [1] 1 2 3 4 5 > Clist [1] "X" "Y" "Z" > expand.grid(Alist, Blist, Clist) Var1 Var2 Var3 1 a1X 2 b1X 3 c1X 4 d1X 5 a2X 6 b2X 7 c2X 8 d2X 9 a3X 10b3X 11c3X 12d3X 13a4X 14b4X 15c4X 16d4X 17a5X 18b5X 19c5X 20d5X 21a1Y 22b1Y 23c1Y 24d1Y 25a2Y 26b2Y 27c2Y 28d2Y 29a3Y 30b3Y 31c3Y 32d3Y 33a4Y 34b4Y 35c4Y 36d4Y 37a5Y 38b5Y 39c5Y 40d5Y 41a1Z 42b1Z 43c1Z 44d1Z 45a2Z 46b2Z 47c2Z 48d2Z 49a3Z 50b3Z 51c3Z 52d3Z 53a4Z 54b4Z 55c 4Z 56d4Z 57a5Z 58b5Z 59c5Z 60d5Z Steven McKinney From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of BORGMANN,Kathi [kborgm...@audubon.org] Sent: March 28, 2011 11:43 AM To: r-help@r-project.org Subject: [R] rep for multiple categories Hi, I am R beginner and am trying to figure out how to generate a complete list of species for every point, visit, and year. The code below is close but does not give me a list of species for every point, visit, and year in my data set. spplist<-unique(sumPtCt$Species) spplength<-length(spplist) Pointlist<-unique(sumPtCt$Point) Pointlength<-length(Pointlist) Visitlist<-unique(sumPtCt$Visit) Visitlength<-length(Visitlist) Yearlist<-unique(sumPtCt$Year) Yearlength<-length(Yearlist) s<-rep(spplist, each=Pointlength, Visitlength, Yearlength) p<-rep(Pointlist, spplength) v<-rep(Visitlist, spplength) y<-rep(Yearlist, spplength) template<-data.frame(Species=s,Point=p, Visit=v, Year=y) ###merge template and data and replace NAs with 0 FinalPtCt<-merge(template, sumPtCt, all=T) FinalPtCt$Number[is.na(FinalPtCt$Number)]<-0 Essentially I have data that look like this SPP Point Visit Number BUFF 1 1 5 WEGR 1 1 10 CLGR 1 1 15 WEGU 2 15 RUDU 2 1 15 HOGR 2 15 But I want to generate this Spp Point Visit Number BUFF 1 1 5 WEGR 1 1 10 CLGR 1 1 15 WEGU 1 1 0 RUDU 1 1 0 HOGR 1 1 0 WEGU 2 1 5 RUDU 2 1 15 HOGR 2 1 5 BUFF 2 1 0 WEGR 2 1 0 CLGR 2 1 0 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] xlsx problem
Dear Christoph Are there formulas in any cells of sheet 6? One thing you might try is to open up the spreadsheet in Excel and create a new sheet, call it Sheet 7. In Sheet 6, select all the cells with data (or just click the upper left icon at the border of the spreadsheet that selects the whole sheet, -a will also do this), then go to sheet 7 and do a Paste Special - Values If you just paste the values, then save the file and read Sheet 7, does that work? While you're in the spreadsheet, see if you can spot any odd looking entries on sheet 6. Excel lets people put just about anything, just about anywhere. Steven McKinney From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of Martin Knapp [mkna...@aucklanduni.ac.nz] Sent: March 28, 2011 1:43 PM To: r-help Subject: [R] xlsx problem Dear list, I'm running windows xp with R 2.12.0. I'm trying to load a excel spreadsheet into R using the xlsx package. I posted my code below with the error I get. > res <- read.xlsx("Copy of test_excel_input_data.xlsx", 6) Error in .jcall("RJavaTools", "Ljava/lang/Object;", "invokeMethod", cl, : java.lang.IllegalStateException: Cannot get a numeric value from a text cell There are 6 sheets in that file. 5 of them are read into a dataframe without problem. For the 6th one(the one I actually need) I get this error message. There is obviously something in the format of the sheet wrong. Has anyone an idea whats going wrong and what I have to change in the sheet? Thanks in advance, Christoph __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Dont show zero values in line graph
How about this? > x<-c(1:5,NA,NA,8:10) > y<-1:10 > plot(0,0,xlim=c(0,10), ylim=c(0,10),type="n",main="Dont show the bloody 0 > values!!") > lines(x~y, col="blue", lwd=2, subset = !is.na(x)) NAs let you do lots of useful manipulations in R. Steven McKinney From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of LCOG1 [jr...@lcog.org] Sent: January 6, 2011 7:10 PM To: r-help@r-project.org Subject: [R] Dont show zero values in line graph Hey everyone, Im getting better at plotting my data but cant for the life of me figure out how to show a line graph with missing data that doesnt continue the line down to zero then back up to the remaining values. Consider the following x<-c(1:5,0,0,8:10) y<-1:10 plot(0,0,xlim=c(0,10), ylim=c(0,10),type="n",main="Dont show the bloody 0 values!!") lines(x~y, col="blue", lwd=2,) My data is missing the 6th and 7th values and they come in as NA's so i change them to 0s but then the plot has these ugly lines that dive toward the x axis then back up. I would do bar plots but i need to show multiple sets of data on the same and side by side bars doesnt do it for me. So i need a line graph that starts and stops where 0s or missing values exist. Thoughts? JR -- View this message in context: http://r.789695.n4.nabble.com/Dont-show-zero-values-in-line-graph-tp3178566p3178566.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Schedule R to run automatically
> -Original Message- > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On > Behalf Of joeponzio > Sent: December-13-10 4:13 PM > To: r-help@r-project.org > Subject: [R] Schedule R to run automatically > > > Current I have raw data from several projects that is written to a drive on a > daily basis. I would like to run a certain R syntax against these data sets > (varying somewhat from data set to data set) every day at, say, 7am (with no > human interaction). Is this possible to set up in windows or unix? Yes, in both. In Windows, you can use the "Task Scheduler" (Control Panel - Administrative Tools on Windows 7). Set up the R script you want to run daily, and invoke it with the R CMD BATCH command. In Unix, you can use the cron scheduler to do the same. Steven McKinney > > Thanks for any input. > > > > -- > View this message in context: > http://r.789695.n4.nabble.com/Schedule-R-to-run-automatically- > tp3086272p3086272.html > Sent from the R help mailing list archive at Nabble.com. > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Tukey Test, lme, error: less than two groups
Just to close this thread, Lilith provided the data which was in a .csv text file and had multiple lines of blank data at the end species;code;treatment;pretreatment;provenance;greenhouse;individual;leaf;Date;DataPAM Ae;c-ae-1-1-3;C;C;1;1;3;1;25.05.10 14:00; 0.665 . . . Ae;w-ae-6-3-4;C;W;6;3;4;3;23.06.10 12:30; 0.622 ; ; ; ; ; ; ; ; ; ; ; Removing the blank (NA) rows of data and keeping all variables in the dataframe resolves the issue. (Free floating variables yield this error > summary(glht(PAM.lme, linfct = mcp(Provenancef = "Tukey"))) Error in `[.data.frame`(mf, nhypo[checknm]) : undefined columns selected) Steven McKinney From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of Steven McKinney [smckin...@bccrc.ca] Sent: December 2, 2010 2:03 PM To: 'Lilith'; r-help@r-project.org Subject: Re: [R] Tukey Test, lme, error: less than two groups Comments in-line below > -Original Message- > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On > Behalf Of Lilith > Sent: December-02-10 9:39 AM > To: r-help@r-project.org > Subject: [R] Tukey Test, lme, error: less than two groups > > > Dear R-group, > > I am trying desperately to get this Tukey test working. Its my first time > here so I hope my question is not too stupid, but I couldn't find anything > helpful in the help or in the forum. > > I am analysing a dataset of treated grasses. I want to find out, if the > grasses from different Provenances react differently. > In the aov test I found a significance for the combination Treatment and > provenance: > > summary(PAMaov<-aov(PAMval~Treatmentf*Pretreatmentf*Provenancef+Error(Datef/Code))) > > Treatmentf:Provenancef p-value: 0.008023 ** > > In the Linear fixed effects model lme, I can see that there is a > significance for two provenances (HU and ES) > > summary(PAM.lme<-lme(PAMval~Treatmentf*Provenancef*Pretreatmentf, random= > ~1|Datef/Code,na.action=na.omit)) > > Value Std.Error DF t-value > p-value > (Intercept) 0.6890317 0.06117401 994 11.263473 > 0. > TreatmentfF -0.2897619 0.05484590 467 -5.283201 > 0. > ProvenancefDE 0.0105873 0.05484590 467 0.193037 > 0.8470 > > TreatmentfF:ProvenancefES 0.1647302 0.08226884 467 2.002340 > 0.0458 > TreatmentfF:ProvenancefHU 0.1569524 0.07756381 467 2.023526 > 0.0436 > > No the big mystery is the Tukey test. I just can't find the mistake, it > keeps telling me, that there are " less than two groups" > > summary(glht(PAM.lme, linfct = mcp(Provenancef = "Tukey"))) > > Fehler in contrMat(table(mf[[nm]]), type = types[pm]) : > less than two groups > > I guess its important to know that I made factors out of some of the data. > Here is the code: > > > PAMdata$provenance[PAMdata$provenance == "5"] = "ES" > PAMdata$provenance[PAMdata$provenance == "6"] = "HU" > # etc. > > Treatmentf <- factor(PAMdata$treatment, levels=c("C","F")) > Datef <- factor(PAMdata$Date, levels=c( "25.05.10 14:00","26.05.10 > 19:00","27.05.2010 7:30","27.05.10 14:00","01.06.10 14:00","02.06.10 > 19:00","23.06.10 12:30"),ordered=TRUE) > > > Pretreatmentf <- as.factor(PAMdata$pretreatment) > Provenancef <- as.factor(PAMdata$provenance) > Greenhousef <- as.factor(PAMdata$greenhouse) > Individualf <- as.factor(PAMdata$individual) > > PAMval <- (PAMdata$DataPAM) > Code<-(PAMdata$code) I suspect the problem is the creation of all these individual variables. Try instead PAMdata$Treatmentf <- factor(PAMdata$treatment, levels=c("C","F")) PAMdata$Datef <- factor(PAMdata$Date, levels=c( "25.05.10 14:00","26.05.10 19:00","27.05.2010 7:30","27.05.10 14:00","01.06.10 14:00","02.06.10 19:00","23.06.10 12:30"),ordered=TRUE) ... PAMdata$PAMval <- (PAMdata$DataPAM) PAMdata$Code<-(PAMdata$code) etc. so that all of your required variables are variables in the dataframe PAMdata. When you pass off fitted model objects to additional functions, the additional functions often require access to the dataframe used in the initial modeling. Then call lme with summary(PAM.lme<-lme(PAMval~Treatmentf*Provenancef*Pretreatmentf, random= ~1|Datef/Code, data = PAMdata, na.action=na.omit)) then try summary(glht(PAM
Re: [R] Tukey Test, lme, error: less than two groups
Comments in-line below > -Original Message- > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On > Behalf Of Lilith > Sent: December-02-10 9:39 AM > To: r-help@r-project.org > Subject: [R] Tukey Test, lme, error: less than two groups > > > Dear R-group, > > I am trying desperately to get this Tukey test working. Its my first time > here so I hope my question is not too stupid, but I couldn't find anything > helpful in the help or in the forum. > > I am analysing a dataset of treated grasses. I want to find out, if the > grasses from different Provenances react differently. > In the aov test I found a significance for the combination Treatment and > provenance: > > summary(PAMaov<-aov(PAMval~Treatmentf*Pretreatmentf*Provenancef+Error(Datef/Code))) > > Treatmentf:Provenancef p-value: 0.008023 ** > > In the Linear fixed effects model lme, I can see that there is a > significance for two provenances (HU and ES) > > summary(PAM.lme<-lme(PAMval~Treatmentf*Provenancef*Pretreatmentf, random= > ~1|Datef/Code,na.action=na.omit)) > > Value Std.Error DF t-value > p-value > (Intercept) 0.6890317 0.06117401 994 11.263473 > 0. > TreatmentfF -0.2897619 0.05484590 467 -5.283201 > 0. > ProvenancefDE 0.0105873 0.05484590 467 0.193037 > 0.8470 > > TreatmentfF:ProvenancefES 0.1647302 0.08226884 467 2.002340 > 0.0458 > TreatmentfF:ProvenancefHU 0.1569524 0.07756381 467 2.023526 > 0.0436 > > No the big mystery is the Tukey test. I just can't find the mistake, it > keeps telling me, that there are " less than two groups" > > summary(glht(PAM.lme, linfct = mcp(Provenancef = "Tukey"))) > > Fehler in contrMat(table(mf[[nm]]), type = types[pm]) : > less than two groups > > I guess its important to know that I made factors out of some of the data. > Here is the code: > > > PAMdata$provenance[PAMdata$provenance == "5"] = "ES" > PAMdata$provenance[PAMdata$provenance == "6"] = "HU" > # etc. > > Treatmentf <- factor(PAMdata$treatment, levels=c("C","F")) > Datef <- factor(PAMdata$Date, levels=c( "25.05.10 14:00","26.05.10 > 19:00","27.05.2010 7:30","27.05.10 14:00","01.06.10 14:00","02.06.10 > 19:00","23.06.10 12:30"),ordered=TRUE) > > > Pretreatmentf <- as.factor(PAMdata$pretreatment) > Provenancef <- as.factor(PAMdata$provenance) > Greenhousef <- as.factor(PAMdata$greenhouse) > Individualf <- as.factor(PAMdata$individual) > > PAMval <- (PAMdata$DataPAM) > Code<-(PAMdata$code) I suspect the problem is the creation of all these individual variables. Try instead PAMdata$Treatmentf <- factor(PAMdata$treatment, levels=c("C","F")) PAMdata$Datef <- factor(PAMdata$Date, levels=c( "25.05.10 14:00","26.05.10 19:00","27.05.2010 7:30","27.05.10 14:00","01.06.10 14:00","02.06.10 19:00","23.06.10 12:30"),ordered=TRUE) ... PAMdata$PAMval <- (PAMdata$DataPAM) PAMdata$Code<-(PAMdata$code) etc. so that all of your required variables are variables in the dataframe PAMdata. When you pass off fitted model objects to additional functions, the additional functions often require access to the dataframe used in the initial modeling. Then call lme with summary(PAM.lme<-lme(PAMval~Treatmentf*Provenancef*Pretreatmentf, random= ~1|Datef/Code, data = PAMdata, na.action=na.omit)) then try summary(glht(PAM.lme, linfct = mcp(Provenancef = "Tukey"))) again. (If you still get an error, run the traceback() command and provide that information.) I'm also wondering why no term for Pretreatmentf shows in your model output. After setting up the factor variables in the PAMdata dataframe, what does the command with(PAMdata, table(Pretreatmentf, Provenancef, Treatmentf)) show? Is Pretreatmentf even needed in the model? The output of the command sessionInfo() is also useful to help people figure out such issues. Also, if you can share the data, or a mock-up of it, others will be able to run code examples, and not just guess. HTH Steve McKinney > > Thank you for any hint! That Tukey test seems so easy, I just can't find the > mistake > Thank you very much fpr your help and greetings from Tanzania, > Lilith > > -- > View this message in context: > http://r.789695.n4.nabble.com/Tukey-Test-lme-error-less-than-two-groups- > tp3069789p3069789.html > Sent from the R help mailing list archive at Nabble.com. > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commente
Re: [R] wilcox.test; data type conversion?
You can set up the data as > grade <- ordered(c("MVG", "VG", "VG", "G", "MVG", "G", "VG", "G", "VG"), > levels = c("G", "VG", "MVG")) > grade [1] MVG VG VG G MVG G VG G VG Levels: G < VG < MVG > sex <- factor(c( "male", "male", "female", "male", "female", "male", > "female", "male", "male"), levels = c("male", "female")) > sex [1] male male female male female male female male male Levels: male female > gradesbysex <- data.frame(grade, sex) > > gradesbysex gradesex 1 MVG male 2VG male 3VG female 4 G male 5 MVG female 6 G male 7VG female 8 G male 9VG male Now for the Wilcoxon-Mann_Whitney test > wilcox.test(grade ~ sex, data = gradesbysex) Error in wilcox.test.default(x = c(3L, 2L, 1L, 1L, 1L, 2L), y = c(2L, : 'x' must be numeric I'm not sure if anyone has written a version that will work on ordered factor variables, but you can coerce the ordered factor to its underlying integer representation with e.g. > wilcox.test(as.integer(grade) ~ sex, data = gradesbysex) Wilcoxon rank sum test with continuity correction data: as.integer(grade) by sex W = 4.5, p-value = 0.2695 alternative hypothesis: true location shift is not equal to 0 Warning message: In wilcox.test.default(x = c(3L, 2L, 1L, 1L, 1L, 2L), y = c(2L, : cannot compute exact p-value with ties You can break the ties by jittering the data. Each jitter will of course produce different tie breakers. A few repeats of the test, or a loop and some summaries of the outcomes, will give you an idea of the "average" result. > wilcox.test(jitter(as.integer(grade)) ~ sex, data = gradesbysex) Wilcoxon rank sum test data: jitter(as.integer(grade)) by sex W = 4, p-value = 0.2619 alternative hypothesis: true location shift is not equal to 0 > wilcox.test(jitter(as.integer(grade)) ~ sex, data = gradesbysex) Wilcoxon rank sum test data: jitter(as.integer(grade)) by sex W = 3, p-value = 0.1667 alternative hypothesis: true location shift is not equal to 0 > wilcox.test(jitter(as.integer(grade)) ~ sex, data = gradesbysex) Wilcoxon rank sum test data: jitter(as.integer(grade)) by sex W = 7, p-value = 0.7143 alternative hypothesis: true location shift is not equal to 0 > wilcox.test(jitter(as.integer(grade)) ~ sex, data = gradesbysex) Wilcoxon rank sum test data: jitter(as.integer(grade)) by sex W = 6, p-value = 0.5476 alternative hypothesis: true location shift is not equal to 0 I'll let you judge elegance. As for the barplots, I think all you need to do is specify the row and column order you'd like. Try this example > barplot(VADeaths, beside = TRUE) > barplot(VADeaths[5:1,c(4, 2, 3, 1)], beside = TRUE) Substitute your data, use beside=FALSE to stack, etc. Steven McKinney From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of Par Leijonhufvud [...@hunter-gatherer.org] Sent: October 28, 2010 8:37 PM To: rhelp Subject: [R] wilcox.test; data type conversion? I'm working on a quick tutorial for my students, and was planning on using Mann-Whitney U as one of the tests. I have the following (fake) data grade <- c("MVG", "VG", "VG", "G", "MVG", "G", "VG", "G", "VG") sex <- c( "male", "male", "female", "male", "female", "male", "female", "male", "male") gradesbysex <- data.frame(grade, sex) The grades is in the Swedish system, where the order is G < VG < MVG The idea is that they will investigate if they can show a grade difference by sex (i.e. that the teacher gives better grades to boys or girls). Since the wilcox.test needs the order of the grades it wants numeric vector for the data. Is there a good and simple (i.e. student compatible) way to handle this? I could tell them to enter data as numbers instead, but an elegant way to do this inside R would be preferable. On the same theme, is there a way to tell barplot that, when making stacked barplots, to stack the data in a particular order (default appears to be alphabetical)? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to write to sqlite files
You will need to install the RSQLite package, see e.g. http://cran.r-project.org/web/packages/RSQLite/index.html Review the installation instructions there for the setup appropriate for your situation. Review the pdf manual there for examples of command sequences involved with connecting to the database, and running data queries. Steven McKinney From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of lord12 [trexi...@yahoo.com] Sent: October 19, 2010 11:42 AM To: r-help@r-project.org Subject: [R] How to write to sqlite files In R, I know how to write ti csv files. However, how do I write to database files? -- View this message in context: http://r.789695.n4.nabble.com/How-to-write-to-sqlite-files-tp3002586p3002586.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] help
Here's one way to do it: Use unlist() to change the function's output from a list to a vector. > f <- function(x) {return(lapply(1:6, function(z) z + x))} > f(2) [[1]] [1] 3 [[2]] [1] 4 [[3]] [1] 5 [[4]] [1] 6 [[5]] [1] 7 [[6]] [1] 8 > x <- seq(0,1, by=0.1) > result <- matrix(0, nrow=6, ncol=length(x)) > for (i in 1:length(x)){result[,i] <- unlist(f(x[i]))} > result [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [1,]1 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2 [2,]2 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 3 [3,]3 3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 4 [4,]4 4.1 4.2 4.3 4.4 4.5 4.6 4.7 4.8 4.9 5 [5,]5 5.1 5.2 5.3 5.4 5.5 5.6 5.7 5.8 5.9 6 [6,]6 6.1 6.2 6.3 6.4 6.5 6.6 6.7 6.8 6.9 7 > HTH Steven McKinney From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of li li [hannah@gmail.com] Sent: October 14, 2010 2:50 PM To: r-help Subject: [R] help Dear all, I have a function f(x) which return a list as result. $T1 [1] 0.03376190 $T2 [1] 0.04725 $T3 [1] 0.3796071 $T4 [1] 0.3713452 $T5 [1] 0.4523651 $T6 [1] 0.4575873 I now find the result for a vector of x values at one time. I want to store the reuslt for each xi value in a column of a matrix x <- seq(0,1, by=0.1) result <- matrix(0, nrow=6, ncol=length(x)) for (i in 1:length(x)){result[,i] <- f(x[i])} It is not working. Can some help me. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] [Rd] Selecting multiple columns with same name
This is better suited for R-help than R-devel, so I'm copying to the R-help list: > -Original Message- > From: r-devel-boun...@r-project.org [mailto:r-devel-boun...@r-project.org] On > Behalf Of Martin Kerr > Sent: October-08-10 3:09 AM > To: r-de...@r-project.org > Subject: [Rd] Selecting multiple columns with same name > > > Hello all, > I've been working on a project involving clustering algorithms and I've hit a > bit of a snag. > I have my main data frame with is 31 X 1000, I have fed this into dif and > hclust in order to produce a > 31 item vector stating the perceived grouping of the columns.E.g. > 1 1 1 1 2 2 2 2 1 1 1 1 2 2 2 2 3 3 3 3 > etc. > What I want to do is use this information to separate each groups worth of > data into a separate frame > so I can perform additional calculations on them.I've been attempting to use > subset by setting the > colnames to the grouping results thus: > colnames(dataFrame) <- groupssubset(dataFrame,select=c(colname="1") Here's one way to do it > fdf <- as.data.frame(matrix(rnorm(100), ncol = 10)) > fdf V1 V2 V3 V4 V5 V6 V7 V8 1 0.35264797 -0.4280407 0.4706150 -0.772936086 0.59984719 0.97885696 0.13569457 0.5005072 2 -0.09800830 -0.3946618 -0.6816040 -0.173057585 -0.95377116 1.32702531 0.51894946 1.8779715 3 0.00585569 0.5240508 0.6334294 0.775787713 1.13537433 -0.75363920 0.09240357 1.7652420 4 -1.28667042 -0.3808195 -1.3735447 0.601288920 0.37448709 1.20875897 1.26392905 0.3573046 5 1.05127892 -0.1717773 0.4795011 0.408584918 -1.57947076 -1.76699298 -2.15778156 -0.6202422 6 0.49935805 -0.5858645 0.1466443 1.094320479 -0.01534562 0.03349714 -0.86508986 0.3335337 7 0.64649298 -0.8044967 1.7273739 0.005654138 0.88092416 -0.43467177 0.33123616 -1.0062133 8 0.67393707 -0.8927181 1.9050954 0.824576116 -1.49872072 0.1361 -0.98904113 -1.1763053 9 -0.06217531 -0.6020426 -0.5198348 0.475774170 0.72492806 -1.93507347 -0.26827918 -0.7902781 10 -4.05961249 -1.1839906 -2.1285662 0.992767748 -1.45187700 -0.32688422 0.92335149 0.2405690 V9V10 1 -1.10422899 0.7343708 2 -0.21511926 -0.3472193 3 -1.56249900 0.6228027 4 -1.64679524 0.9548577 5 0.31530976 0.7420800 6 0.02644282 -1.0393438 7 -0.70669500 -0.8335578 8 -0.29898269 1.8679939 9 -0.08449491 -0.7413130 10 0.66960457 -0.464 > colGroups <- c(1,1,1,2,1,3,3,2,3,1) > fdf[, colGroups == 1] V1 V2 V3 V5V10 1 0.35264797 -0.4280407 0.4706150 0.59984719 0.7343708 2 -0.09800830 -0.3946618 -0.6816040 -0.95377116 -0.3472193 3 0.00585569 0.5240508 0.6334294 1.13537433 0.6228027 4 -1.28667042 -0.3808195 -1.3735447 0.37448709 0.9548577 5 1.05127892 -0.1717773 0.4795011 -1.57947076 0.7420800 6 0.49935805 -0.5858645 0.1466443 -0.01534562 -1.0393438 7 0.64649298 -0.8044967 1.7273739 0.88092416 -0.8335578 8 0.67393707 -0.8927181 1.9050954 -1.49872072 1.8679939 9 -0.06217531 -0.6020426 -0.5198348 0.72492806 -0.7413130 10 -4.05961249 -1.1839906 -2.1285662 -1.45187700 -0.464 > fdf[, colGroups == 2] V4 V8 1 -0.772936086 0.5005072 2 -0.173057585 1.8779715 3 0.775787713 1.7652420 4 0.601288920 0.3573046 5 0.408584918 -0.6202422 6 1.094320479 0.3335337 7 0.005654138 -1.0062133 8 0.824576116 -1.1763053 9 0.475774170 -0.7902781 10 0.992767748 0.2405690 > fdf[, colGroups == 3] V6 V7 V9 1 0.97885696 0.13569457 -1.10422899 2 1.32702531 0.51894946 -0.21511926 3 -0.75363920 0.09240357 -1.56249900 4 1.20875897 1.26392905 -1.64679524 5 -1.76699298 -2.15778156 0.31530976 6 0.03349714 -0.86508986 0.02644282 7 -0.43467177 0.33123616 -0.70669500 8 0.1361 -0.98904113 -0.29898269 9 -1.93507347 -0.26827918 -0.08449491 10 -0.32688422 0.92335149 0.66960457 > and this can be automated as a loop or with lapply() and the like. HTH Steve McKinney > This however only returns the first column rather than all instances of a > column with that name. Note > that these columns may not necessarily be contiguous. > Is this the correct way to go about this? > Thank You > Martin Kerr > [[alternative HTML version deleted]] > > __ > r-de...@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] interactive session
For those that want it all... > {cat("?"); a<-readLines(n=1) + print("hey") + print(b<-paste("t",a,sep=""))} ?ada [1] "hey" [1] "tada" > b [1] "tada" > Steven McKinney > -Original Message- > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On > Behalf Of Peter Dalgaard > Sent: September-30-10 1:23 PM > To: Pam > Cc: r-help@r-project.org > Subject: Re: [R] interactive session > > On 09/30/2010 03:33 PM, Pam wrote: > > Thanks Niels but it won't do.. please copy and paste the 2 lines below > > together > > to your console in order to see what I mean: > > > > cat("?"); a<-readLines(n=1) > > b<-paste("t",a,sep="") > > > > anyone / any idea to overcome this problem? > > > > Best, > > Fatih > > > > You might want to source() a file with those lines rather than pasting > them to the console. There's just no way you can retroactively insert > text between two already submitted lines. > > You can do things like this, though > > {cat("?"); a<-readLines(n=1) > "hey" > b<-paste("t",a,sep="")} > > However, there's a catch > > > {cat("?"); a<-readLines(n=1) > + "hey" > + b<-paste("t",a,sep="")} > ?ada > > b > [1] "tada" > > Notice that the "hey" doesn't print. > > -- > Peter Dalgaard > Center for Statistics, Copenhagen Business School > Phone: (+45)38153501 > Email: pd@cbs.dk Priv: pda...@gmail.com > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] bptest
Hi Rob, I googled "cran bptest" and the first hit was for package "lmtest". I installed lmtest, loaded it up, and bptest is available there. > require("lmtest") Loading required package: lmtest Loading required package: zoo > bptest function (formula, varformula = NULL, studentize = TRUE, data = list()) ... So see if you can get package lmtest. HTH Cheers Steven McKinney From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of robm [r.malp...@ntlworld.com] Sent: September 24, 2010 6:53 AM To: r-help@r-project.org Subject: [R] bptest Hi I'm very new to R but have plenty of experience with statistics and other packages like SPSS, SAS etc. I have a dataset of around 20 columns and 200 rows. I'm trying to fit a very simple linear model between two variables. Having done so, I want to test the model for heteroscedasticity using the Breusch-Pagan test. Apparently this is easy in R by simply doing bptest(modelCH, data=KP) I've tried this but I'm told it cannot find function bptest. It's here where I'm struggling. I'm probably wrong but as far as I can see, bptest is part of the lm package which, as far as I know, I have installed. Irrespective of the fact I'm not sure how to tell bptest which is the dependent and explanatory variables - there's a more fundamental problem if it can't find the bptest function. I have searched the documentation - albeit briefly so if anyone could help I'd be very grateful Rob QBE Management -- View this message in context: http://r.789695.n4.nabble.com/bptest-tp2553506p2553506.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] French accents on characters
In addition to the reply from Duncan Murdoch, help("text") and demo("Hershey") are very informative in this regard. Simple example with plot for added text: plot(1,1) text(0.8, 0.8, "B\\`ad Fr\\'ench", vfont=c("serif", "plain")) title(main = "B\u{E0}d Fr\u{E9}nch") >From the help page for plotmath in package grDevices, see this example (to which I added two additional entries for a_grave and a_acute) for added text: > plot.new(); plot.window(c(0,4), c(15,1)) > text(1, 1, "universal", adj=0); text(2.5, 1, "\\042") > text(3, 1, expression(symbol("\042"))) > text(1, 2, "existential", adj=0); text(2.5, 2, "\\044") > text(3, 2, expression(symbol("\044"))) > text(1, 3, "suchthat", adj=0); text(2.5, 3, "\\047") > text(3, 3, expression(symbol("\047"))) > text(1, 4, "therefore", adj=0); text(2.5, 4, "\\134") > text(3, 4, expression(symbol("\134"))) > text(1, 5, "perpendicular", adj=0); text(2.5, 5, "\\136") > text(3, 5, expression(symbol("\136"))) > text(1, 6, "circlemultiply", adj=0); text(2.5, 6, "\\304") > text(3, 6, expression(symbol("\304"))) > text(1, 7, "circleplus", adj=0); text(2.5, 7, "\\305") > text(3, 7, expression(symbol("\305"))) > text(1, 8, "emptyset", adj=0); text(2.5, 8, "\\306") > text(3, 8, expression(symbol("\306"))) > text(1, 9, "angle", adj=0); text(2.5, 9, "\\320") > text(3, 9, expression(symbol("\320"))) > text(1, 10, "leftangle", adj=0); text(2.5, 10, "\\341") > text(3, 10, expression(symbol("\341"))) > text(1, 11, "rightangle", adj=0); text(2.5, 11, "\\361") > text(3, 11, expression(symbol("\361"))) > text(1, 12, "a_grave", adj=0); text(2.5, 12, "\\'a") > text(3, 12, "\\'a", vfont=c("serif", "plain")) > text(1, 13, "a_acute", adj=0); text(2.5, 13, "\\`a") > text(3, 13, "\\`a", vfont=c("serif", "plain")) > Run > demo("Hershey") and look at the functions defined at the beginning of the demo for more examples of writing commands to get these accented letters and other glyphs into plots. http://www.unicode.org/charts/ has useful information on unicode values for all kinds of alphabets. e.g. Click on the "Latin-1 Supplements" link (URL is: http://www.unicode.org/charts/PDF/U0080.pdf) to get a PDF with unicode values as used above (e.g. \u{E0} for a-aigu). HTH Steven McKinney Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of Jennifer Young [jennifer.yo...@math.mcmaster.ca] Sent: August 4, 2010 12:40 PM To: r-help@r-project.org Subject: [R] French accents on characters Hello Could someone please direct me to the correct commands for adding accents (grave and aigu) to a letter in a plot title, label, or in added text? I'm sure there's a handy list somewhere, but I've failed in coming up with the correct search words to find it. Thank you muchly! Jen __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] multiple R sessions from one working directory using GNU screen
Dear Olga An R session is conducted entirely in the RAM memory of your computer, and each invocation of R will have its own memory space, not shared with any other application, including another R session. You will have to architect a scheme to allow one R session to find out about events and objects in another R session. This might involve writing files to disk from session R1, having session R2 check the directory for new files every now and then, and so on. (save.image() saves objects from one R session to a file on disk, but there can be other items in the saved file such as environments that are not straightforward to manipulate or investigate from another R session.) Your situation is similar to cluster computing, where a task is broken up into independent pieces and each piece is handled by a separate R process. Reading about cluster computing with R might help you figure out strategies to share bits of data and information across multiple R sessions. HTH Steven McKinney From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of Olga Lyashevska [o...@herenstraat.nl] Sent: August 3, 2010 8:04 AM To: r-help@r-project.org Cc: Olga Lyashevska Subject: [R] multiple R sessions from one working directory using GNU screen Dear all, I am using GNU screen to run multiple R sessions from one working directory in order to split task, however I noticed that dataset is not synchronized e.g. if I have two sessions R1 and R2, and I remove an object from R1, R2 doesn't change as expected or change at random. I have tried to save.image(), q() and then restart both sessions, but it does not help. Any suggestions? Many thanks Olga R version 2.11.1 (2010-05-31) 2010 x86_64 GNU/Linux Ubuntu 10.04.1 LTS Intel(R) Xeon(R) CPU X3220 @ 2.40GHz MemTotal: 4050180 kB __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to do a IF ELSE in a matrix format
Here's one way: edu[edu[, 1] < 0.5, 2] <- 1 > set.seed(1) > edu<-matrix(0,1000,2) > edu1<-runif(1000,0,1) > edu[,1]<-edu1 > head(edu) [,1] [,2] [1,] 0.26550870 [2,] 0.37212390 [3,] 0.57285340 [4,] 0.90820780 [5,] 0.20168190 [6,] 0.89838970 > edu[edu[, 1] < 0.5, 2] <- 1 > head(edu) [,1] [,2] [1,] 0.26550871 [2,] 0.37212391 [3,] 0.57285340 [4,] 0.90820780 [5,] 0.20168191 [6,] 0.89838970 > Steven McKinney > -Original Message- > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On > Behalf Of Hey Sky > Sent: August-02-10 6:17 PM > To: r-help@r-project.org > Subject: [R] how to do a IF ELSE in a matrix format > > Hey, Rers > > I am new to R and this question may has been asked many times or a very old > one. I have checked > > the archive but found nothing (maybe I used the wrong key words). any advise > is > appreciated > > the question is: > > I try to replace a col vector with some other values, see > > edu<-matrix(0,1000,2) > > edu1<-runif(1000,0,1) > edu[,1]<-edu1 > > the colum vector edu[,1] would have value between 0 and 1. if I want to > replace the value of edu[,2] > > to 1 if edu[,1] less than 0.5, how to do it in the form of matrix but > not under > a loop structure? > > I have tried: > > if (edu[,1]>=0.5) edu[,2]=1 > > there is only the first 0.5 have been done but not all the others. thanks for > any kind answer. > > Nan > from Montreal > > > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to find non-ASCII characters in .Rd files?
Does function showNonASCII(x) in package "tools" do what you need? Steven McKinney Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of Michael Friendly [frien...@yorku.ca] Sent: July 29, 2010 6:27 PM To: r-help Subject: [R] how to find non-ASCII characters in .Rd files? [Env: Win XP Pro / R 2.11.1] I keep occasionally running into the annoying problem of getting warnings from R CMD check regarding non ASCII characters in .Rd files, without any easy way of finding them. Mostly these come from copy/paste of references or other material from web pages or Win applications, and often involve varieties of quote-like characters, e.g., if I try to include output from anova() as a documentation comment, #Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 These are extremely hard to spot in an editor and R CMD check doesn't give line numbers or indicate what non-ASCII characters were found, so the only way I can easily find the offending characters is to copy the .Rd file to a linux machine and run od -c file.Rd |more What commands can I run in R against a .Rd file to locate such non-ASCII characters -- Michael Friendly Email: frien...@yorku.ca Professor, Psychology Dept. York University Voice: 416 736-5115 x66249 Fax: 416 736-5814 4700 Keele Streethttp://www.math.yorku.ca/SCS/friendly.html Toronto, ONT M3J 1P3 CANADA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] re-ordering bwplot
Here's one way - you can reorder the levels of a factor variable by specifying the levels the way you want them. > require("lattice") Loading required package: lattice > bwplot( conc ~ Type : Treatment, data = CO2 ) > str(CO2$Type) Factor w/ 2 levels "Quebec","Mississippi": 1 1 1 1 1 1 1 1 1 1 ... > > str(CO2$Treatment) Factor w/ 2 levels "nonchilled","chilled": 1 1 1 1 1 1 1 1 1 1 ... > ?bwplot > CO2 <- CO2 ## Make a copy in the global environment > CO2$Type2 <- factor(as.character(CO2$Type), levels = c("Mississippi", > "Quebec")) > with(CO2, table(Type, Type2)) Type2 Type Mississippi Quebec Quebec0 42 Mississippi 42 0 > quartz() ## Your plot window type here (quartz() works on a Mac) > bwplot( conc ~ Type : Treatment, data = CO2 ) > quartz() > bwplot( conc ~ Type2 : Treatment, data = CO2 ) > Steven McKinney Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of Eck, Bradley J [brad@mail.utexas.edu] Sent: July 23, 2010 10:02 AM To: r-help@r-project.org Subject: [R] re-ordering bwplot Dear list: I'm using bwplot to compare concentrations by location and treatment as in: # using built in data bwplot( conc ~ Type : Treatment, data = CO2 ) I would like the order of the plots to be: 3,4,1,2. I can't seem to figure this out with index.cond or permc.cond. Any help is appreciated! Brad Eck [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Can RMySQL run on Windows using a remote MySQL server?
So the answer is yes. Install the MySQL Workbench application on your XP box and work out the URL, account name and password, and port connection issues to the reomte MySQL database. You should then have a MySQL driver available. Set up a User DSN or System DSN to the remote database using ODBC Data Source Administrator widget in Windows administrative tools. Install RMySQL, configure and enjoy. Good luck! Steven McKinney From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of neatgadgets [neatgadg...@gmail.com] Sent: July 19, 2010 3:26 PM To: r-help@r-project.org Subject: Re: [R] Can RMySQL run on Windows using a remote MySQL server? I'm using Windows XP. -- View this message in context: http://r.789695.n4.nabble.com/Can-RMySQL-run-on-Windows-using-a-remote-MySQL-server-tp2293533p2294705.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Can RMySQL run on Windows using a remote MySQL server?
I have done this from my previous computer, a Windows PC running Windows XP (32 bit). So in theory it can be done. I had installed the MySQL Workbench application as well which no doubt set up the MySQL dll's needed. My new computer is running Windows 7 64-bit for which RMySQL is not yet built. What OS are you running? Steven McKinney Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of neatgadgets [neatgadg...@gmail.com] Sent: July 18, 2010 9:09 PM To: r-help@r-project.org Subject: [R] Can RMySQL run on Windows using a remote MySQL server? Can RMySQL run on Windows using a remote MySQL server? -- View this message in context: http://r.789695.n4.nabble.com/Can-RMySQL-run-on-Windows-using-a-remote-MySQL-server-tp2293533p2293533.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] RMySQL package on 64bit R for Windows
I think this is a question for R-devel so I'm cross-posting there with apologies. I've just acquired a Windows 7 64-bit box and also will need RMySQL eventually. Is there any information about issues involved with compiling RMySQL for Windows 64-bit? Steven McKinney Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre > -Original Message- > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On > Behalf Of abeSRT > Sent: July-16-10 1:00 PM > To: r-help@r-project.org > Subject: Re: [R] RMySQL package on 64bit R for Windows > > > Did you ever find a solution for this? > -- > View this message in context: > http://r.789695.n4.nabble.com/RMySQL-package-on-64bit-R-for-Windows- > tp2248726p2291892.html > Sent from the R help mailing list archive at Nabble.com. > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] inverse function of melt
?cast A reproducible example would get you more feedback. Steven McKinney From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of n.via...@libero.it [n.via...@libero.it] Sent: June 18, 2010 6:39 AM To: r-help@r-project.org Subject: [R] inverse function of melt Dear list, I'm looking for an inverse function of melt(which is in package reshape).Namely, I had a data frame like this (Table1) YEAR VAR1 VAR2 VAR3 1995 7 3 45 1996 5 632 1997 6 10 15 I transformed my data by using the melt function and my data was reshaped in the following format: (Table2) YEARvariable value 1995VAR1 7 1996 VAR1 5 1997 VAR1 6 1995VAR2 7 1996 VAR2 5 1997 VAR2 6 1995VAR3 7 1996 VAR3 5 1997 VAR3 6 Now I would like to come back to the original format, namely table1. Anyone could help me?? Thanks for your attention! [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Competing with SPSS and SAS: improving code that loops throughrows (data manipulation)
Hi Dimitri Your code is complex, so this won't be easy for anyone to deparse. I think part of the issue is in the calculation you are doing in your innermost loop data[data[[group.var]] %in% subgroup, name][case]= 1-((1-data[data[[group.var]] %in% subgroup, name][case-1]*exp(1)^d)/(exp(1)^(data[data[[group.var]] %in% subgroup, var][case]*l*10))) This looks to me to be a lag-1 calculation, and not all lagged calculations are vectorizable, so this part may have to be done in a loop. If so, working on a matrix instead of a dataframe can help speed up code dramatically, especially with big dataframes (I've seen 10-fold increases in speed by working with big matrices with no row or column names instead of dataframes). Pre-declare a matrix big enough to hold all your results outside of the loop, then do your calculations using the matrix inside the loop. You can convert the matrix to a dataframe after the loop completes if needed. HTH Steven McKinney From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of Dimitri Liakhovitski [ld7...@gmail.com] Sent: March 26, 2010 6:40 PM To: Bert Gunter Cc: r-help Subject: Re: [R] Competing with SPSS and SAS: improving code that loops throughrows (data manipulation) My sincere apologies if it looked large. Let me try again with less code. It's hard to do less than that. In fact - there is nothing in this code but 1 formula and many loops, which is the problem I am not sure how to solve. I also tried to be as clear as possible with the comments. Dimitri ## START OF THE CODE TO PRODUCE SMALL DATA EXAMPLE set.seed(123) data<-data.frame(group=c(rep("first",10),rep("second",10)),a=abs(round(rnorm(20,mean=0, sd=.55),2)), b=abs(round(rnorm(20,mean=0, sd=.55),2))) data # "data" it is the data frame to work with ## END OF THE CODE TO PRODUCE SMALL DATA EXAMPLE. In real life "data" would contain up to 150-200 rows PER SUBGROUP ### Specifying useful parameters used in the slow code below: vars<-names(data)[2:3]# names of variables used in transformation; in real life - up to 50-60 variables group.var<-names(data)[1]# name of the grouping variable subgroups<-levels(data[[group.var]]) # names of subgroups; in real life - up to 30 subgroups # OBJECTIVE: # Need to create new variables based on the old ones (a & b) # For each new variable, the value in a given row is a function of (a) 2 constants (that have several levels each), # (b) value of the original variable (e.g., a.ind.to.max"), and the value in the previous row on the same new variable # Plus - it has to be done by subgroup (variable "group") # Defining 2 constants: constant1<-c(1:3)# constant 1 used in transformation - has 3 levels, in real life - up to 7 levels constant2<-seq(.15,.45,.15) # constant 2 used in transformation - has 3 levels, in real life - up to 7 levels ### CODE THAT IS SLOW. Reason - too many loops with the inner-most loop being very slow - as it is looping through rows: for(var in vars){ # looping through variables for(c1 in 1:length(constant1)){# looping through values of constant1 for(c2 in 1:length(constant2)){ # looping through values of constant2 d=log(0.5)/constant1[c1] l=-log(1-constant2[c2]) name<-paste(var,constant1[c1],constant2[c2]*100,".transf",sep=".") data[[name]]<-NA for(subgroup in subgroups){ # looping through subgroups data[data[[group.var]] %in% subgroup, name][1] = 1-((1-0*exp(1)^d)/(exp(1)^(data[data[[group.var]] %in% subgroup, var][1]*l*10))) ### THIS SECTION IS THE SLOWEST - BECAUSE I AM LOOPING THROUGH ROWS: for(case in 2:nrow(data[data[[group.var]] %in% subgroup, ])){ # looping through rows data[data[[group.var]] %in% subgroup, name][case]= 1-((1-data[data[[group.var]] %in% subgroup, name][case-1]*exp(1)^d)/(exp(1)^(data[data[[group.var]] %in% subgroup, var][case]*l*10))) } ### END OF THE SLOWEST SECTION (INNERMOST LOOP) } } } } ### END OF THE CODE On Fri, Mar 26, 2010 at 5:25 PM, Bert Gunter wrote: > Dmitri: > > If you follow the R posting guide you're more likely to get useful replies. > In particular it asks for **small** reproducible examples -- your example is > far more code then I care to spend time on anyway (others may be more > willing or more able to do so of course). I suggest you try (if you haven't > already): > > 1. Profiling the code using Rprof to isolate where the time is spent.And > then... > > 2. Writing a **small** reproducible example to exercise that portion of the > code and post it with your question to the list. If you need to... > Typically,
Re: [R] Factor variables with GAM models
Hi Noah GAM models were developed to assess the functional form of the relationship of continuous predictor variables to the response, so weren't really meant to handle factor variables as predictor variables. GAMs are of the form E(Y | X1, X2, ...) = So + S(X1) + S(X2) + ... where S(X) is a smooth function of X. Hence you might want to rethink why you'd want a factor variable as a predictor variable in a GAM. This is why the gam machinery doesn't just do the factor conversion to indicator variables as is done in lm. HTH Steven McKinney From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of Noah Silverman [n...@smartmediacorp.com] Sent: March 19, 2010 12:54 PM To: r-help@r-project.org Subject: [R] Factor variables with GAM models I'm just starting to learn about GAM models. When using the lm function in R, any factors I have in my data set are automatically converted into a series of binomial variables. For example, if I have a data.frame with a column named color and values "red", "green", "blue". The lm function automatically replaces it with 3 variables colorred, colorgreen, colorblue which are binomial {0,1} When I use the gam function, R doesn't do this so I get an error. 1) Is there a way to ask the gam function to do this conversion for me? 2) If not, is there some other tool or utility to make this data transformation easy? 3) Last option - can I use lm to transform the data and then extract it into a new data.frame to then pass to gam? Thanks!!! __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Merging Matrices
> From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf > Of duncandonutz [dwads...@unm.edu] > Sent: March 19, 2010 1:11 PM > To: r-help@r-project.org > Subject: [R] Merging Matrices > > I have two symmetric matrices, but of different dimensions. The entries are > identified by specific labels some of which are shared by both matrices. I > would like to sum the two matrices, but retain the union of the two. In > other words, I want the result to be the same size as the larger of the two > matrices, but with the entries that they share added together. > > cbind() and rbind() don't work since the matrices are different sizes > (symmetrically). Merge doesn't want to cooperate, although it might be > because I can't understand the documentation. I tried the package "reshape" > but didn't get very far with that either. I tried simply adding (+) them, > but that was a stupid first try. With appropriate indexing of the matrices via their shared names, it can be done as follows: > A <- matrix(1:9, nrow = 3) > A[lower.tri(A)] <- A[upper.tri(A)] > B <- matrix(1:25, nrow = 5) > B[lower.tri(B)] <- B[upper.tri(B)] > dimnames(A) <- list(letters[c(2, 4, 5)], letters[c(2, 4, 5)]) > dimnames(B) <- list(letters[1:5], letters[1:5]) > A b d e b 1 4 7 d 4 5 8 e 7 8 9 > B a b c d e a 1 6 11 16 21 b 6 7 12 17 22 c 11 17 13 18 23 d 12 18 22 19 24 e 16 21 23 24 25 > B[dimnames(A)[[1]], dimnames(A)[[1]]] b d e b 7 17 22 d 18 19 24 e 21 24 25 > C <- B > C[dimnames(A)[[1]], dimnames(A)[[1]]] <- A + B[dimnames(A)[[1]], > dimnames(A)[[1]]] > C a b c d e a 1 6 11 16 21 b 6 8 12 21 29 c 11 17 13 18 23 d 12 22 22 24 32 e 16 28 23 32 34 > C - B a b c d e a 0 0 0 0 0 b 0 1 0 4 7 c 0 0 0 0 0 d 0 4 0 5 8 e 0 7 0 8 9 > HTH Steve McKinney > Any help is appreciated! > -- > View this message in context: > http://n4.nabble.com/Merging-Matrices-tp1605474p1605474.html > Sent from the R help mailing list archive at Nabble.com. > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Plot frame border to start at zero?
> -Original Message- > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r- > project.org] On Behalf Of Douglas M. Hultstrand > Sent: Tuesday, January 19, 2010 4:42 PM > To: R mailing list > Subject: [R] Plot frame border to start at zero? > > Hello, > > I am creating plots of hourly precipitation and accumulated > precipitation (on different axis, see attached image). I was wondering > how can I have the plot frame (black border) start at zero, it looks > like it is plotted less than zero? It doesn't appear to be plotted starting at less than zero. S language plots were designed from the beginning to have a little extra space around the data points so none of your data points get lost or obscured because they overstrike the borders. Thus the surrounding box is drawn about 5% outside of the data range in the X and Y axis directions, to avoid any data point overstrike. You can specify the x and y axis data ranges with the xlim and ylim arguments if you want complete control of the data range. It can be done, but you will have to work hard to get the graph to do bad things like print over your data! (See the various plot control arguments in e.g. help("par") ). > > The code I use to create the png files is below: > > CairoPNG(PNG_file,width=1000, height=600, pointsize=14, bg="white") > opar <- par(mai = c(.8, .8, 1, .8))# Set margins > plot(ppt,type="h",col="dark grey",ylim=c(0,max_ppt+1), lwd=6, > xlab='', ylab='', main=title, frame.plot=T, axes=F) > axis(2, col="black", las=2) > par(new=T) # R to overwrite the first plot > plot(accum, type="l", col="blue", axes=F, ylab='', xlab='', > lwd=2) > axis(1, col="black", las=1) > axis(4, pretty(c(0, max_accum) ), col="black", las=2) > legend("topleft", legend=c('Incremental', 'Accumulated'), > col=c("dark grey", "blue"), lwd=c(6, 2.0)) > legend((length(n)-4),max_accum, > legend=paste(round(max_accum,2)), bty="n", cex=.90) ### max accum > location > mtext(paste("Lat:", .lat, "Lon:", .lon), cex=0.95) > mtext("Index Hour", side=1, line=2) > mtext("Incremental Precipitation (in)", side=2, line=2.5) > mtext("Accumulated Precipitation (in)", side=4, line=2) > dev.off() > > > Thanks for all your help, > Doug > > - > Douglas M. Hultstrand, MS > Senior Hydrometeorologist > Metstat, Inc. Windsor, Colorado > voice: 970.686.1253 > email: dmhul...@metstat.com > web: http://www.metstat.com > - Steven McKinney, Ph.D. Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre email: smckinney -at- bccrc +dot+ ca tel: 604-675-8000 x7561 BCCRC Molecular Oncology 675 West 10th Ave, Floor 4 Vancouver B.C. V5Z 1L3 Canada __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] help writing for loop
Hi, The great thing about the S language is that it is 'vectorized', so you can do a lot of matrix manipulations without loops. Here's a smaller example of what you describe. matrix A with 3 columns and 10 rows (instead of 100) matrix B with 3 columns and 15 rows (instead of 1500) > set.seed(123) > mA <- matrix(runif(30), ncol = 3) > mA [,1] [,2] [,3] [1,] 0.2875775 0.95683335 0.8895393 [2,] 0.7883051 0.45333416 0.6928034 [3,] 0.4089769 0.67757064 0.6405068 [4,] 0.8830174 0.57263340 0.9942698 [5,] 0.9404673 0.10292468 0.6557058 [6,] 0.0455565 0.89982497 0.7085305 [7,] 0.5281055 0.24608773 0.5440660 [8,] 0.8924190 0.04205953 0.5941420 [9,] 0.5514350 0.32792072 0.2891597 [10,] 0.4566147 0.95450365 0.1471136 > x <- seq(15) > mB <- cbind(1, x, x^2) > mB x [1,] 1 1 1 [2,] 1 2 4 [3,] 1 3 9 [4,] 1 4 16 [5,] 1 5 25 [6,] 1 6 36 [7,] 1 7 49 [8,] 1 8 64 [9,] 1 9 81 [10,] 1 10 100 [11,] 1 11 121 [12,] 1 12 144 [13,] 1 13 169 [14,] 1 14 196 [15,] 1 15 225 > mR <- apply(mB, 1, function(x){mA %*% x}) > mR [,1] [,2] [,3] [,4] [,5] [,6] [,7] [1,] 2.133950 5.759401 11.163931 18.347540 27.310227 38.05199 50.57284 [2,] 1.934443 4.466187 8.383538 13.686496 20.375061 28.44923 37.90901 [3,] 1.727054 4.326145 8.206250 13.367368 19.809500 27.53265 36.53681 [4,] 2.449921 6.005363 11.549346 19.081867 28.602929 40.11253 53.61067 [5,] 1.699098 3.769140 7.150594 11.843459 17.847736 25.16342 33.79052 [6,] 1.653912 4.679328 9.121806 14.981344 22.257943 30.95160 41.06232 [7,] 1.318259 3.196545 6.162963 10.217513 15.360195 21.59101 28.90995 [8,] 1.528621 3.353106 6.365876 10.566930 15.956267 22.53389 30.29979 [9,] 1.168515 2.363915 4.137635 6.489674 9.420032 12.92871 17.01571 [10,] 1.558232 2.954077 4.644149 6.628448 8.906974 11.47973 14.34671 [,8] [,9] [,10] [,11] [,12] [,13] [,14] [1,] 64.87276 80.95176 98.80984 118.44700 139.86324 163.05856 188.03295 [2,] 48.75440 60.98539 74.60199 89.60419 105.99201 123.76542 142.92445 [3,] 46.82198 58.38816 71.23536 85.36358 100.77281 117.46305 135.43430 [4,] 69.09735 86.57257 106.03633 127.48863 150.92947 176.35884 203.77676 [5,] 43.72904 54.97896 67.54029 81.41304 96.59720 113.09277 130.89975 [6,] 52.59011 65.53495 79.89685 95.67582 112.87184 131.48493 151.51508 [7,] 37.31703 46.81224 57.39559 69.06706 81.82667 95.67440 110.61027 [8,] 39.25398 49.39646 60.72722 73.24626 86.95358 101.84919 117.93309 [9,] 21.68102 26.92466 32.74662 39.14689 46.12549 53.68240 61.81763 [10,] 17.50792 20.96335 24.71302 28.75691 33.09502 37.72737 42.65394 [,15] [1,] 214.78642 [2,] 163.46908 [3,] 154.68657 [4,] 233.18322 [5,] 150.01814 [6,] 172.96229 [7,] 126.63428 [8,] 135.20527 [9,] 70.53119 [10,] 47.87474 > so the results matrix mR has one column for each element of x, and the i-th row element is the solution for y = a + bx + cx^2 for the i-th row of coefficients in the parameter estimates matrix mA You can spot check with e.g. > mA[1,] %*% mB[15,] [,1] [1,] 214.7864 > mA[1,] %*% mB[5,] [,1] [1,] 27.31023 > mA[1,] [1] 0.2875775 0.9568333 0.8895393 > mB[5,] x 1 5 25 > HTH Steven McKinney, Ph.D. Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre email: smckinney +at+ bccrc +dot+ ca tel: 604-675-8000 x7561 BCCRC Molecular Oncology 675 West 10th Ave, Floor 4 Vancouver B.C. V5Z 1L3 Canada From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of Jessica Schedlbauer [jsche...@fiu.edu] Sent: November 25, 2009 10:43 AM To: r-help@r-project.org Subject: [R] help writing for loop Hi, I’d like to ask for some help in writing a loop. My situation is the following: I have a matrix (matrix.A) containing 3 columns and 100 rows. The columns represent parameter estimates a, b, and c. The rows contain different values for these parameter estimates. Each row is unique. I want to insert these parameter estimates into a model (say, y = a + bx + cx^2) and solve for y given a separate matrix (matrix.B) of x values (where x has a length of 1500). I want to solve for y 100 times using each set of the parameter estimates in matrix.A once. At present my code looks like this and it only performs the first iteration. For (i in 1:length(matrix.A)) { y <- matrix.A$a[[i]] + matrixA$b[[i]] * matrix.B$x + matrixA$c[[i]] * matrix.B$x^2) I have not been able to figure out how to loop through the rows of parameter estimates in matrix.A. I am new to writing loops, so any assistance would be much appreciated. Regards, Jessica Schedlbauer __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting
Re: [R] Where are usages like "== 2L" documented?
> -Original Message- > From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com] > Sent: Monday, November 16, 2009 4:52 PM > To: Duncan Murdoch > Cc: Steven McKinney; R Help > Subject: Re: [R] Where are usages like "== 2L" documented? > > On Mon, Nov 16, 2009 at 7:25 PM, Duncan Murdoch > wrote: > > On 16/11/2009 6:47 PM, Steven McKinney wrote: > >> > >> ?NumericConstants > >> > >> will bring up a help page that mentions > >> "All other numeric constants start with a digit or period and are > either a > >> decimal or hexadecimal constant optionally followed by L." > >> > >> and > >> > >> "An numeric constant immediately followed by L is regarded as an > integer > >> number when possible (and with a warning if it contains a "."). " The word "integer" in the above sentence of the NumericConstants help page is hyperlinked to the integer() function page. There is then no example or discussion of L there. > >> > >> but I haven't found discussion of it anywhere else in the help > pages. > >> Others may know what other help pages discuss this. > >> > >> I'm surprised that the help page invoked from > >> ?integer > >> does not discuss this. Anyone know why not? > > > > This is part of the syntax of the language. It has nothing to do > with the > > integer() function, which is what ?integer is asking about. > > It might be useful to have a SeeAlso to NumericConstants on that help > page for those who looked up ?integer thinking it might be about > integer constants. Yes, additional discussion of "L" would be very valuable. I've had several people ask me about usages, as this original poster did. I think that increased use of L has outpaced updating of help entries. Given that L is appearing in more places, I'd like to request additional discussion of it and examples using it in help pages. > class(1L) [1] "integer" > storage.mode(1L) [1] "integer" Since "integer" is the term often associated with this language construct, that seems a natural place to say something about it, and direct users to other appropriate help pages. The help page for storage.mode() shows an example with "1i" in it, could "1L" please also be added? ("1.0" or "1." would also be useful.) cex3 <- c("NULL","1","1:1","1i","list(1)","data.frame(x=1)", "pairlist(pi)", "c", "lm", "formals(lm)[[1]]", "formals(lm)[[2]]", "y~x","expression((1))[[1]]", "(y~x)[[1]]", "expression(x <- pi)[[1]][[1]]") The "L" language construct is often used in length checks such as in the sample() function " if (length(x) == 1L ..." The length() function help page discusses " The default method currently returns an integer of length 1." again with the "integer" hyperlinked to the integer() help page. Since length() therefore can only assess integer lengths from 0 to about 2^31 - 1 it would be helpful to discuss this integer "L" construct and the range of values that can be expressed with mode "integer" more fully somewhere in one of these help topics. > sample function (x, size, replace = FALSE, prob = NULL) { if (length(x) == 1L && is.numeric(x) && x >= 1) { if (missing(size)) size <- x .Internal(sample(x, size, replace, prob)) } else { if (missing(size)) size <- length(x) x[.Internal(sample(length(x), size, replace, prob))] } } Best Steve McKinney __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Where are usages like "== 2L" documented?
?NumericConstants will bring up a help page that mentions "All other numeric constants start with a digit or period and are either a decimal or hexadecimal constant optionally followed by L." and "An numeric constant immediately followed by L is regarded as an integer number when possible (and with a warning if it contains a "."). " but I haven't found discussion of it anywhere else in the help pages. Others may know what other help pages discuss this. I'm surprised that the help page invoked from ?integer does not discuss this. Anyone know why not? Best Steve McKinney > -Original Message- > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r- > project.org] On Behalf Of Bryan Hanson > Sent: Monday, November 16, 2009 3:23 PM > To: R Help > Subject: [R] Where are usages like "== 2L" documented? > > Gurus: > > I keep seeing other people¹s code that contain ideas like > > If (x == 2L) > X[-1L] > X - 1L > > I have some idea of what¹s going on, but where is the use of concepts > like > ³2L² documented? > > Thanks, Bryan > * > Bryan Hanson > Acting Chair > Professor of Chemistry & Biochemistry > DePauw University, Greencastle IN USA > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting- > guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Adjusting Yaxis (ylim) limits on a plotMeans(DV, IV1, IV2, error.bars="se")
Hi Sergey I've attached a script with a ylim argument added to plotMeans. If you are using Rcmdr, you can load this script via the Rcmdr File menu item File Open Script File so save this plotMeans.R script somewhere, and load it and run it in Rcmdr, then do your plot command plotMeans(DV, IV1, IV2, error.bars="se", ylim = c(27, 99)) or whatever your desired ylim values are. HTH Steven McKinney, Ph.D. Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre email: smckinney +at+ bccrc +dot+ ca tel: 604-675-8000 x7561 BCCRC Molecular Oncology 675 West 10th Ave, Floor 4 Vancouver B.C. V5Z 1L3 Canada From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of Sergios (Sergey) Charntikov [sergios...@gmail.com] Sent: November 5, 2009 7:51 PM To: r-help@r-project.org Subject: [R] Adjusting Yaxis (ylim) limits on a plotMeans(DV, IV1, IV2, error.bars="se") Hello everyone, I have tried to look for this everywhere and so far have no luck. I have a plotMeans(DV, IV1, IV2, error.bars="se") graph that plots my data (DV-continuous, IVs are factors, IV1 - two levels, IV2-four levels). I am trying to increase a scale of my y-axis (to be consistent with my other graphs), but unfortunately nothing works with "plotMeans" function, which is the only one I see so fat to plot what I need. Could anyone provide any suggestions or advice? Thank you. Sergey (UNL, Behavioral Psychopharmacology Lab) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. plotMeans <- function (response, factor1, factor2, error.bars = c("se", "sd", "conf.int", "none"), level = 0.95, xlab = deparse(substitute(factor1)), ylab = paste("mean of", deparse(substitute(response))), legend.lab = deparse(substitute(factor2)), main = "Plot of Means", pch = 1:n.levs.2, lty = 1:n.levs.2, col = palette(), ylim = NULL) { if (!is.numeric(response)) stop(gettextRcmdr("Argument response must be numeric.")) xlab ylab legend.lab error.bars <- match.arg(error.bars) if (missing(factor2)) { if (!is.factor(factor1)) stop(gettextRcmdr("Argument factor1 must be a factor.")) valid <- complete.cases(factor1, response) factor1 <- factor1[valid] response <- response[valid] means <- tapply(response, factor1, mean) sds <- tapply(response, factor1, sd) ns <- tapply(response, factor1, length) if (error.bars == "se") sds <- sds/sqrt(ns) if (error.bars == "conf.int") sds <- qt((1 - level)/2, df = ns - 1, lower.tail = FALSE) * sds/sqrt(ns) sds[is.na(sds)] <- 0 yrange <- if (error.bars != "none") c(min(means - sds, na.rm = TRUE), max(means + sds, na.rm = TRUE)) else range(means, na.rm = TRUE) levs <- levels(factor1) n.levs <- length(levs) if ( is.null(ylim) ) { plot(c(1, n.levs), yrange, type = "n", xlab = xlab, ylab = ylab, axes = FALSE, main = main) } else { plot(c(1, n.levs), yrange, type = "n", xlab = xlab, ylab = ylab, axes = FALSE, main = main, ylim = ylim) } points(1:n.levs, means, type = "b", pch = 16, cex = 2) box() axis(2) axis(1, at = 1:n.levs, labels = levs) if (error.bars != "none") arrows(1:n.levs, means - sds, 1:n.levs, means + sds, angle = 90, lty = 2, code = 3, length = 0.125) } else { if (!(is.factor(factor1) | is.factor(factor2))) stop(gettextRcmdr("Arguments factor1 and factor2 must be factors.")) valid <- complete.cases(factor1, factor2, response) factor1 <- factor1[valid] factor2 <- factor2[valid] response <- response[valid] means <- tapply(response, list(factor1, factor2), mean) sds <- tapply(response, list(factor1, factor2), sd) ns <- tapply(response, list(factor1, factor2), length) if (error.bars == "se") sds <- sds/sqrt(ns) if (error.bars == "conf.int") sds <- qt((1 - level)/2, df = ns - 1, lower.tail = FALSE) * sds/sqrt(ns) sds[is.na(sds)] <- 0 yrange <- if (error.bars != "none") c(min(means - sds,
Re: [R] cox regression extract strata as numeric
> -Original Message- > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r- > project.org] On Behalf Of ?e???da? ?pa?t?? > Sent: Tuesday, October 27, 2009 3:51 PM > To: r-help@r-project.org > Subject: [R] cox regression extract strata as numeric > > > Hi there, > > I perform a stratified cox and then I need the strata as a numeric > array "straft.ln" > ft.ln <- > coxph(Surv(times,deaths)~ages+chemos+chemos:f1+chemos:f2+horms+horms:f1 > +horms:f2+grades+grades:f1+grades:f2+positives+positives:f1+positives:f > 2+sizes+sizes:f1+sizes:f2+strata(stra),data=ddd) > > basehazzft.ln=basehaz(ft.ln,centered=FALSE) > H0ft.ln=c(basehazzft.ln[,1]) > timeft.ln=c(basehazzft.ln[,2]) > straft.ln=c(basehazzft.ln[,3]) > > I notice that basehazzft.ln[,3] is not the same with straft.ln with > respect to the values i.e.: > > (basehazzft.ln$stra[1]) > [1] stra=1 > 134 Levels: stra=1 stra=10 stra=100 stra=101 stra=102 ... stra=99 > > c(basehazzft.ln$stra[1]) > [1] 1 > > > so far so good but: > > (basehazzft.ln$stra[285]) > [1] stra=2 > 134 Levels: stra=1 stra=10 stra=100 stra=101 stra=102 ... stra=99 > > c(basehazzft.ln$stra[285]) > [1] 47 > > while the desired value is 2, I get a 47. What am I doing wrong? I > tried the as.numeric function but I have the same problem.. > > I just want instead of the [ "stra=1" "stra=1" "stra=1" etc] the > [1,1,1 etc] > > Thanx in advance for any answers! Here's one way to do it (using the example on coxph help page): > require("survival") Loading required package: survival Loading required package: splines > bladder1 <- bladder[bladder$enum < 5, ] > m1 <- coxph(Surv(stop, event) ~ (rx + size + number) * strata(enum) + > cluster(id), bladder1) > basehazzft.ln=basehaz(m1,centered=FALSE) > (basehazzft.ln$stra[1]) [1] enum=1 Levels: enum=1 enum=2 enum=3 enum=4 > basehazzft.ln$stra [1] enum=1 enum=1 enum=1 enum=1 enum=1 enum=1 enum=1 enum=1 enum=1 enum=1 enum=1 enum=1 enum=1 [14] enum=1 enum=1 enum=1 enum=1 enum=1 enum=1 enum=1 enum=1 enum=2 enum=2 enum=2 enum=2 enum=2 [27] enum=2 enum=2 enum=2 enum=2 enum=2 enum=2 enum=2 enum=2 enum=2 enum=2 enum=2 enum=2 enum=2 [40] enum=2 enum=3 enum=3 enum=3 enum=3 enum=3 enum=3 enum=3 enum=3 enum=3 enum=3 enum=3 enum=3 [53] enum=3 enum=3 enum=3 enum=3 enum=3 enum=4 enum=4 enum=4 enum=4 enum=4 enum=4 enum=4 enum=4 [66] enum=4 enum=4 enum=4 Levels: enum=1 enum=2 enum=3 enum=4 > as.character(basehazzft.ln$stra) [1] "enum=1" "enum=1" "enum=1" "enum=1" "enum=1" "enum=1" "enum=1" "enum=1" "enum=1" "enum=1" [11] "enum=1" "enum=1" "enum=1" "enum=1" "enum=1" "enum=1" "enum=1" "enum=1" "enum=1" "enum=1" [21] "enum=1" "enum=2" "enum=2" "enum=2" "enum=2" "enum=2" "enum=2" "enum=2" "enum=2" "enum=2" [31] "enum=2" "enum=2" "enum=2" "enum=2" "enum=2" "enum=2" "enum=2" "enum=2" "enum=2" "enum=2" [41] "enum=3" "enum=3" "enum=3" "enum=3" "enum=3" "enum=3" "enum=3" "enum=3" "enum=3" "enum=3" [51] "enum=3" "enum=3" "enum=3" "enum=3" "enum=3" "enum=3" "enum=3" "enum=4" "enum=4" "enum=4" [61] "enum=4" "enum=4" "enum=4" "enum=4" "enum=4" "enum=4" "enum=4" "enum=4" > substr(as.character(basehazzft.ln$stra), 6, 999) [1] "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "2" "2" "2" [25] "2" "2" "2" "2" "2" "2" "2" "2" "2" "2" "2" "2" "2" "2" "2" "2" "3" "3" "3" "3" "3" "3" "3" "3" [49] "3" "3" "3" "3" "3" "3" "3" "3" "3" "4" "4" "4" "4" "4" "4" "4" "4" "4" "4" "4" > as.numeric(substr(as.character(basehazzft.ln$stra), 6, 999)) [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 [49] 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4 > HTH Steven McKinney Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre > > > > > > ___ > ×ñçóéìïðïéåßôå Yahoo!; > ÂáñåèÞêáôå ôá åíï÷ëçôéêÜ ìçíýìáôá (spam); Ôï Yahoo! Mail > äéáèÝôåé ôçí êáëýôåñç äõíáôÞ ðñïóôáóßá êáôÜ ôùí åíï÷ëçôéêþí > ìçíõìÜôùí http://login.yahoo.com/config/mail?.intl=gr > > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to apply the Wilcoxon test to a hole table at once?
> -Original Message- > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r- > project.org] On Behalf Of Iurie Malai > Sent: Friday, October 23, 2009 10:46 AM > To: r-help@r-project.org > Subject: [R] How to apply the Wilcoxon test to a hole table at once? > > > Hi, > > I have a data set: > > > Dataset > X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 > 1 user1 m 22 19 28 24 12 18 9 7 4 5 4 7 5 7 9 > 2 user2 f 25 19 23 18 18 15 6 8 6 6 7 10 7 7 7 > 3 user3 f 28 21 24 18 15 12 10 6 7 9 5 10 5 9 5 > 4 user4 f 26 19 26 21 12 18 6 6 5 1 3 8 6 5 6 > 5 user5 m 21 22 26 18 9 6 4 6 1 7 2 4 4 6 4 > 6 user6 m 24 8 25 12 18 12 7 8 4 1 4 6 7 5 6 > ... > 71 user71 m 18 4 10 6 3 6 9 5 10 8 4 5 6 5 5 > > I can apply the Wilcoxon test on each column one by one, but how to do > this > on the hole table at once? > > > wilcox.test(X3 ~ X2, alternative="two.sided", data=Dataset) > > Wilcoxon rank sum test with continuity correction > > data: X3 by X2 > W = 439, p-value = 0.1291 > alternative hypothesis: true location shift is not equal to 0 > > > Here's one way to do it (using airquality dataset) > lapply(airquality[1:4], function(x) wilcox.test(x ~ Month, > alternative="two.sided", data=airquality, subset = Month <= 6)) $Ozone Wilcoxon rank sum test with continuity correction data: x by Month W = 82, p-value = 0.1925 alternative hypothesis: true location shift is not equal to 0 $Solar.R Wilcoxon rank sum test with continuity correction data: x by Month W = 391.5, p-value = 0.8354 alternative hypothesis: true location shift is not equal to 0 $Wind Wilcoxon rank sum test with continuity correction data: x by Month W = 566, p-value = 0.1461 alternative hypothesis: true location shift is not equal to 0 $Temp Wilcoxon rank sum test with continuity correction data: x by Month W = 78, p-value = 2.400e-08 alternative hypothesis: true location shift is not equal to 0 Warning messages: 1: In wilcox.test.default(x = c(41L, 36L, 12L, 18L, 28L, 23L, 19L, : cannot compute exact p-value with ties 2: In wilcox.test.default(x = c(190L, 118L, 149L, 313L, 299L, 99L, : cannot compute exact p-value with ties 3: In wilcox.test.default(x = c(7.4, 8, 12.6, 11.5, 14.3, 14.9, 8.6, : cannot compute exact p-value with ties 4: In wilcox.test.default(x = c(67L, 72L, 74L, 62L, 56L, 66L, 65L, : cannot compute exact p-value with ties # Sanity check: > wilcox.test(Temp ~ Month, alternative="two.sided", data=airquality, subset = > Month <= 6) Wilcoxon rank sum test with continuity correction data: Temp by Month W = 78, p-value = 2.400e-08 alternative hypothesis: true location shift is not equal to 0 Warning message: In wilcox.test.default(x = c(67L, 72L, 74L, 62L, 56L, 66L, 65L, : cannot compute exact p-value with ties > # Same as lapply loop result HTH Steven McKinney Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre > > I researched on this, but I can't find a solution. > I would really appreciate any help. > > P.S. Excuse my lack of terminology :). > > Iurie Malai > Moldova Pedagogical State University > -- > View this message in context: http://www.nabble.com/How-to-apply-the- > Wilcoxon-test-to-a-hole-table-at-once--tp26030572p26030572.html > Sent from the R help mailing list archive at Nabble.com. > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting- > guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Navigate to Index page of a package from R command prompt
Okay, I've worked out much of the details on my PC and Mac. What I'd like to be able to do is enter the command > help(package = "survival") or > ?survival:: and get the usual hyperlinked help page displayed instead of the static "text only" display or an error message. The help() function returns an object of class "help_files_with_topic". The object consists of a character vector with several attributes. PC: Windows XP > library("survival") > foo <- help("aareg", package = "survival") > class(foo) [1] "help_files_with_topic" > foo[1] [1] "C:/PROGRA~1/R/R-29~1.1/library/survival/chm/aareg" > attributes(foo) $call help(topic = "aareg", package = "survival") $pager [1] "internal" $topic [1] "aareg" $tried_all_packages [1] FALSE $type [1] "chm" $class [1] "help_files_with_topic" > bar <- help("", package = "survival") > class(bar) [1] "help_files_with_topic" > bar[1] [1] NA > attributes(bar) $call help(topic = "", package = "survival") $pager [1] "internal" $topic [1] "" $tried_all_packages [1] FALSE $type [1] "chm" $class [1] "help_files_with_topic" If I alter the character vector to point to "00Index" > bar[1] <- "C:/PROGRA~1/R/R-29~1.1/library/survival/chm/00Index" > bar I see exactly what I've been attempting to achieve. Mac OS X: > foo <- help("aareg", package = "survival") > foo[1] [1] "/Library/Frameworks/R.framework/Resources/library/survival/html/aareg.html" > attributes(foo) $call help(topic = "aareg", package = "survival") $pager [1] "/Library/Frameworks/R.framework/Resources/bin/pager" $topic [1] "aareg" $tried_all_packages [1] FALSE $type [1] "html" $class [1] "help_files_with_topic" > bar <- help("", package = "survival") > bar[1] [1] NA > bar[1] <- > "/Library/Frameworks/R.framework/Resources/library/survival/html/00Index.html" > bar Again I see exactly what I've been after. Running R in Emacs on Mac OS X: > foo <- help(topic = "aareg", package = "survival") > foo[1] [1] "/Library/Frameworks/R.framework/Resources/library/survival/html/aareg.html" > attributes(foo) $call help(topic = "aareg", package = "survival") $pager [1] "/Library/Frameworks/R.framework/Resources/bin/pager" $topic [1] "aareg" $tried_all_packages [1] FALSE $type [1] "html" $class [1] "help_files_with_topic" > bar <- help(topic = "", package = "survival") > bar[1] [1] NA > bar[1] <- > "/Library/Frameworks/R.framework/Resources/library/survival/html/00Index.html" > bar Help for '' is shown in browser /usr/bin/open ... Use help("", htmlhelp = FALSE) or options(htmlhelp = FALSE) to revert. > Again, what I've been trying to achieve. When a user loads a new library and doesn't yet know any function names, I think > help(package = "newLibrary") or > ?newLibrary:: should perform the above action whenever possible instead of producing static help or an error message. The "00Index" 'object' should be available for such use whenever it exists. (I've not yet investigated all the details to answer when it is and is not created in the Help building exercise of package building.) I can work out the coding details to make this happen, but am I missing some key point? Any reasons why this would be a Bad Idea? Best Steven McKinney, Ph.D. Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre email: smckinney -at- bccrc +dot+ ca tel: 604-675-8000 x7561 BCCRC Molecular Oncology 675 West 10th Ave, Floor 4 Vancouver B.C. V5Z 1L3 Canada > -Original Message- > From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com] > Sent: Thursday, July 23, 2009 4:48 PM > To: Steven McKinney > Cc: R-help@r-project.org > Subject: Re: [R] Navigate to Index page of a package from R command > prompt > > Try > > enter at the R console: help.start() > and then when the help comes up in your browser click on Packages > and then click on the package you want > and then click on the help file you want > > On Thu, Jul 23, 2009 at 3:30 PM, Steven McKinney > wrote: > > > > Hi all, > > > > Is there a way to navigate directly to the "Index" page of help > > for a package? > > > > Here's my connundrum: > > > > I download and install
Re: [R] Navigate to Index page of a package from R command prompt
> -Original Message- > From: Marc Schwartz [mailto:marc_schwa...@me.com] > Sent: Thursday, July 23, 2009 2:04 PM > To: Steven McKinney > Cc: R-help@r-project.org > Subject: Re: [R] Navigate to Index page of a package from R command > prompt > > On Jul 23, 2009, at 3:30 PM, Steven McKinney wrote: > > > > > Hi all, > > > > Is there a way to navigate directly to the "Index" page of help > > for a package? > > > > Here's my connundrum: > > > > I download and install package "foo". > > I don't know what functions are in package "foo", > > so I can't invoke the help for package "foo" via > >> ?someFunction > > > > help(package = "foo") > > pops up some non-hyperlinked information page, not > > package foo's help Index page. > > > > If the package author kindly made a "foo" object or function > > and put that in package "foo", then > >> ?foo > > works and yields a help page for package "foo". > > Now at the bottom of the help page is a hyperlink "Index" > > and I can click that to navigate to the main help Index page > > (the page I really want to get to straight from the R > > command line). > > > > I see that the link to "Index" for package "foo" appears always to be > > (on my Mac) > > > file:///Library/Frameworks/R.framework/Resources/library/foo/html/00Ind > ex.html > > > > e.g. > > > file:///Library/Frameworks/R.framework/Resources/library/cmprsk/html/00 > Index.html > > > > > file:///Library/Frameworks/R.framework/Resources/library/utils/html/00I > ndex.html > > > > Is there a command from the R listener that can take me directly to > > this "00Index.html" page of help for package "foo"? > > > > something like > >> help("00Index", package = "utils") > > > > (but this does not work)? > > > > Any info appreciated > > > > Best > > > > Steven McKinney > > > Steven, > > Once a package has been loaded with 'library(PackageName)', you can > use .path.package("PackageName") to get the path to the main package > installation folder and then append the remainder: > > library(survival) > > > .path.package("survival") > [1] "/Library/Frameworks/R.framework/Resources/library/survival" > > > file.path(.path.package("survival"), "html", "00Index.html") > [1] "/Library/Frameworks/R.framework/Resources/library/survival/html/ > 00Index.html" > > See ?.path.package and ?file.path for more information. > > You can also use .find.package() for unloaded packages and that is > described on the same help page as .path.package. > > Finally, you can use browseURL() to bring the index file up in your > browser: > >browseURL(file.path(.path.package("survival"), "html", > "00Index.html")) > > HTH, > > Marc Schwartz Thanks Marc, Certainly useful information. I'm just baffled and confused as to why > ?survival or > ?survival:Index or some such incantation doesn't just take me right to the index page for package "survival" in the default help browser for the R session. Why does it take so much work to get to the index page? > library("survival") > browseURL(file.path(.path.package("survival"), "html", "00Index.html")) does conceptually do what I'm looking for, but not in the default browser. I'd like to file a feature request on R-devel if this is not too naïve of a request. ?survival:Index or some such syntax should do what you illustrate above, but in the default help mechanism. I don't yet know enough about the ? and ?? machinery to develop such a capability. Any ideas? Best Steve McKinney __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] mathematical notation in R
Hi Mary, Something such as > plot(1,1) > title(expression(paste("LBAuo = -", infinity))) Is this what you're after? Best Steve McKinney From: Mary A. Marion [mms...@comcast.net] Sent: July 23, 2009 4:23 PM To: Steven McKinney Subject: Re: [R] mathematical notation in R Hi, LBAuo <- -Inf was helpful but what I really would like to do is use the "Greek" expression for infinity that is the number 8 on its side. syntax meaning infinityinfinity symbol was found using ?plotmath but I don't know how to use this information. Any ideas? Thank you. Sincerely, Mary A. Marion Steven McKinney wrote: Does this approach what you're looking for? See ?is.finite for more info LBAuo<- - LBAuo [1] - if (LBAuo <= -9999) LBAuo <- -Inf LBAuo [1] -Inf HTH Steven McKinney -Original Message- From: r-help-boun...@r-project.org<mailto:r-help-boun...@r-project.org> [mailto:r-help-boun...@r- project.org] On Behalf Of Mary A. Marion Sent: Wednesday, July 22, 2009 8:00 PM To: r-help@r-project.org<mailto:r-help@r-project.org> Subject: [R] mathematical notation in R Hello, I have an R function which includes the statement LBAuo<- -. Rather than printing out - I want it to print out - infinity. As of yet I have not been able to do that. I am not in a graphics window just outputting a set of variables. Can you help? Thanks. Sincerely, Mary A. Marion __ R-help@r-project.org<mailto:R-help@r-project.org> mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Navigate to Index page of a package from R command prompt
Hi all, Is there a way to navigate directly to the "Index" page of help for a package? Here's my connundrum: I download and install package "foo". I don't know what functions are in package "foo", so I can't invoke the help for package "foo" via > ?someFunction help(package = "foo") pops up some non-hyperlinked information page, not package foo's help Index page. If the package author kindly made a "foo" object or function and put that in package "foo", then > ?foo works and yields a help page for package "foo". Now at the bottom of the help page is a hyperlink "Index" and I can click that to navigate to the main help Index page (the page I really want to get to straight from the R command line). I see that the link to "Index" for package "foo" appears always to be (on my Mac) file:///Library/Frameworks/R.framework/Resources/library/foo/html/00Index.html e.g. file:///Library/Frameworks/R.framework/Resources/library/cmprsk/html/00Index.html file:///Library/Frameworks/R.framework/Resources/library/utils/html/00Index.html Is there a command from the R listener that can take me directly to this "00Index.html" page of help for package "foo"? something like > help("00Index", package = "utils") (but this does not work)? Any info appreciated Best Steven McKinney Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] mathematical notation in R
Does this approach what you're looking for? See ?is.finite for more info > LBAuo<- - > LBAuo [1] - > if (LBAuo <= -) LBAuo <- -Inf > LBAuo [1] -Inf > HTH Steven McKinney > -Original Message- > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r- > project.org] On Behalf Of Mary A. Marion > Sent: Wednesday, July 22, 2009 8:00 PM > To: r-help@r-project.org > Subject: [R] mathematical notation in R > > Hello, > > I have an R function which includes the statement LBAuo<- -. > Rather > than printing out - I want it to print out - infinity. > As of yet I have not been able to do that. I am not in a graphics > window just outputting a set of variables. Can you help? > Thanks. > > Sincerely, > Mary A. Marion > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting- > guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Referencing data frames
Hi Payam, Here's a basic example using pointless data frames and an otherwise useless function to illustrate the issues you want > ## Fresh R session with nothing yet defined > foo <- data.frame(matrix(1:12, ncol = 3)) > bar <- data.frame(matrix(101:112, ncol = 3)) > > objects() [1] "bar" "foo" > > myrbind <- function(names) { + df1 <- get(names[1]) + df2 <- get(names[2]) + dfrbind <- rbind(df1, df2) + dfrbind.name <- paste(names, collapse = ".") + assign(dfrbind.name, dfrbind, pos = 1) + invisible() + } > > dfnames <- c("foo", "bar", "baz") > myrbind(dfnames) > objects() [1] "bar" "dfnames" "foo" "foo.bar.baz" "myrbind" > foo.bar.baz X1 X2 X3 1 1 5 9 2 2 6 10 3 3 7 11 4 4 8 12 5 101 105 109 6 102 106 110 7 103 107 111 8 104 108 112 > HTH Steven McKinney, Ph.D. Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre email: smckinney +at+ bccrc +dot+ ca tel: 604-675-8000 x7561 BCCRC Molecular Oncology 675 West 10th Ave, Floor 4 Vancouver B.C. V5Z 1L3 Canada -Original Message- From: r-help-boun...@r-project.org on behalf of Payam Minoofar Sent: Fri 6/12/2009 3:00 PM To: r-help@r-project.org Subject: [R] Referencing data frames Hi, How do I use the string content of a string variable to reference a data frame of the same name? I want to do the typical tasks of 1) building a name with a string variable and using the string variable to create a data frame (or any object) whose name is the string value of the variable and 2) pass on a string to a function as a parameter, and then use that string to refer to an existing data frame. Thanks in advance. Payam -- Payam Minoofar, Ph.D. Scientist Meissner Filtration Products 4181 Calle Tesoro Camarillo, CA 93012 USA +1 805 388 9911 +1 805 388 5948 fax payam.minoo...@meissner.com [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Need help understanding output from aov and from anova
Hi Suman, What version of R are you running? In R 2.9.0 running your first example yields a warning Warning message: In anova.lm(lm(vtot ~ fac)) : ANOVA F-tests on an essentially perfect fit are unreliable so some adept R developer has taken the time to figure out how to warn you about such a problem. Perhaps someone will add this to aov() at some point as well. The only variability in this problem is that introduced by machine precision rounding errors. The exercise of submitting data with no variability to a program designed to assess variability cannot be expected to produce meaningful output, so there's nothing to understand except the issue of machine precision. Machine roundoff error is an important topic, so I'd recommend learning about that issue, which will do most to help understand these examples. Best SteveM R version 2.9.0 (2009-04-17) Copyright (C) 2009 The R Foundation for Statistical Computing ISBN 3-900051-07-0 R is free software and comes with ABSOLUTELY NO WARRANTY. You are welcome to redistribute it under certain conditions. Type 'license()' or 'licence()' for distribution details. Natural language support but running in an English locale R is a collaborative project with many contributors. Type 'contributors()' for more information and 'citation()' on how to cite R or R packages in publications. Type 'demo()' for some demos, 'help()' for on-line help, or 'help.start()' for an HTML browser interface to help. Type 'q()' to quit R. > vtot=c(7.29917, 7.29917, 7.29917) #identical values > fac=as.factor(c(1,1,2)) #group 1 has first two elements, group 2 has > anova(lm(vtot~fac)) Analysis of Variance Table Response: vtot Df Sum SqMean Sq F value Pr(>F) fac1 1.6818e-30 1.6818e-30 0. 0.6667 Residuals 1 5.0455e-30 5.0455e-30 Warning message: In anova.lm(lm(vtot ~ fac)) : ANOVA F-tests on an essentially perfect fit are unreliable > > summary(aov(vtot~fac)) Df Sum SqMean Sq F value Pr(>F) fac 1 1.6818e-30 1.6818e-30 0. 0.6667 Residuals1 5.0455e-30 5.0455e-30 > > fac=as.factor(c(1,2,2)) > anova(lm(vtot~fac)) Analysis of Variance Table Response: vtot Df Sum SqMean SqF valuePr(>F) fac1 6.7274e-30 6.7274e-30 1.3340e+32 < 2.2e-16 *** Residuals 1 5.043e-62 5.043e-62 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Warning message: In anova.lm(lm(vtot ~ fac)) : ANOVA F-tests on an essentially perfect fit are unreliable > Steven McKinney, Ph.D. Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre email: smckin...@bccrc.ca tel: 604-675-8000 x7561 BCCRC Molecular Oncology 675 West 10th Ave, Floor 4 Vancouver B.C. V5Z 1L3 Canada > -Original Message- > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r- > project.org] On Behalf Of Suman Sundaresh > Sent: Wednesday, June 03, 2009 3:55 PM > To: r-help@r-project.org > Subject: [R] Need help understanding output from aov and from anova > > Hi all, > > I noticed something strange when I ran aov and anova. > > vtot=c(7.29917, 7.29917, 7.29917) #identical values > fac=as.factor(c(1,1,2)) #group 1 has first two elements, group 2 has > the 3rd element > > When I run: > > anova(lm(vtot~fac)) > Analysis of Variance Table > > Response: vtot > Df Sum SqMean Sq F value Pr(>F) > fac1 1.6818e-30 1.6818e-30 0. 0.6667 > Residuals 1 5.0455e-30 5.0455e-30 > > > I get a p-value of 0.667. This seems strange to me. I would have > expected the p-value to be NaN. > > Again, when I run: > > summary(aov(vtot~fac)) > Df Sum SqMean Sq F value Pr(>F) > fac 1 1.6818e-30 1.6818e-30 0. 0.6667 > Residuals1 5.0455e-30 5.0455e-30 > > Again same p-value. > > > Now, if I set fac to c(1,2,2) which is essentially just switching the > groups. > fac=as.factor(c(1,2,2)) > > And run, > > anova(lm(vtot~fac)) > Analysis of Variance Table > > Response: vtot > Df Sum SqMean SqF valuePr(>F) > fac1 6.7274e-30 6.7274e-30 1.3340e+32 < 2.2e-16 *** > Residuals 1 5.043e-62 5.043e-62 > --- > Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 > > > The p-value is really significant which again looks very strange. > > Please could someone shed some light on what I may be missing here? > > Thanks very much. > Suman. > >
Re: [R] Warning trying to plot -log(log(survival))
Hi John, Thanks for the data. I put your data in a plain text file and read it in without any problems. > jsdf <- read.csv("js.csv") > jsdf Time Time30 Died Age Rx 1 3 0.1000 40 2 2 8 0.2671 21 2 310 0.3331 18 2 412 0.4000 42 2 516 0.5331 23 2 ... I can fit the model you specified and obtain the plot without problem. So this looks like an issue with R 2.8.1 and whatever survival package version you have. I have R 2.9.0 and survival_2.35-4. Can you update your versions? HTH > fit0 <- coxph(Surv(Time30, Died) ~ strata(Rx) + Age, data = jsdf) > fit0 Call: coxph(formula = Surv(Time30, Died) ~ strata(Rx) + Age, data = jsdf) coef exp(coef) se(coef)z p Age 0.0611 1.06 0.0257 2.38 0.017 Likelihood ratio test=5.84 on 1 df, p=0.0156 n= 64 > plot(survfit(fit0),fun="cloglog") ### No warning, two curves plotted. > sessionInfo() R version 2.9.0 (2009-04-17) i386-pc-mingw32 locale: LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252 attached base packages: [1] splines stats graphics grDevices utils datasets methods [8] base other attached packages: [1] survival_2.35-4 > Steven McKinney > -Original Message- > From: John Sorkin [mailto:jsor...@grecc.umaryland.edu] > Sent: Monday, May 11, 2009 8:53 PM > To: Steven McKinney; r-h...@stat.math.ethz.ch > Subject: RE: [R] Warning trying to plot -log(log(survival)) > > Time30 is simply Time/30 and reflects months of follow-up. > A copy of my data follows: > Time,Time30,Died,Age,Rx > 3,0.1,0,40,2 > 8,0.26667,1,21,2 > 10,0.3,1,18,2 > 12,0.4,0,42,2 > 16,0.5,1,23,2 > 17,0.56667,1,21,2 > 22,0.7,1,13,2 > 64,2.1,0,20,2 > 65,2.16667,0,15,2 > 77,2.56667,0,34,2 > 82,2.7,0,14,2 > 98,3.26667,0,10,2 > 155,5.16667,0,27,2 > 189,6.3,0,9,2 > 199,6.6,0,19,2 > 247,8.2,0,14,2 > 324,10.8,0,23,2 > 356,11.8667,0,13,2 > 378,12.6,0,34,2 > 408,13.6,0,27,2 > 411,13.7,0,5,2 > 420,14,0,23,2 > 449,14.9667,0,37,2 > 490,16.,0,37,2 > 528,17.6,0,32,2 > 547,18.2333,0,32,2 > 691,23.0333,0,38,2 > 769,25.6333,0,18,2 > ,37.0333,0,20,2 > 1173,39.1,0,12,2 > 1213,40.4333,0,12,2 > 1357,45.2333,0,29,2 > 9,0.3,1,35,1 > 11,0.36667,1,27,1 > 12,0.4,1,22,1 > 20,0.7,1,21,1 > 20,0.7,1,30,1 > 22,0.7,1,7,1 > 25,0.8,1,36,1 > 25,0.8,1,38,1 > 25,0.8,0,20,1 > 28,0.9,1,25,1 > 28,0.9,1,28,1 > 31,1.0,1,17,1 > 35,1.16667,1,21,1 > 35,1.16667,1,25,1 > 46,1.5,1,35,1 > 49,1.6,1,19,1 > 104,3.46667,0,27,1 > 106,3.5,0,19,1 > 156,5.2,0,15,1 > 218,7.26667,0,26,1 > 230,7.7,0,11,1 > 231,7.7,0,14,1 > 316,10.5333,0,15,1 > 393,13.1,0,27,1 > 395,13.1667,0,2,1 > 428,14.2667,0,3,1 > 469,15.6333,0,14,1 > 602,20.0667,0,18,1 > 681,22.7,0,23,1 > 690,23,0,9,1 > 1112,37.0667,0,11,1 > 1180,39.,0,11,1 > > > > > John David Sorkin M.D., Ph.D. > Chief, Biostatistics and Informatics > University of Maryland School of Medicine Division of Gerontology > Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) > Baltimore, MD 21201-1524 > (Phone) 410-605-7119 > (Fax) 410-605-7913 (Please call phone number above prior to faxing) > > >>> Steven McKinney 5/11/2009 10:04 PM >>> > Hi John, > > I can't reproduce your case with built-in data sets. > > Your data set is small. Can you show the data for the variables > involved in your example? > (Time30, Died, Rx, Age) > > > > > > Steven McKinney, Ph.D. > > Statistician > Molecular Oncology and Breast Cancer Program British Columbia Cancer > Research Centre > > email: smckin...@bccrc.ca > tel: 604-675-8000 x7561 > > BCCRC > Molecular Oncology > 675 West 10th Ave, Floor 4 > Vancouver B.C. > V5Z 1L3 > > Canada > > > > > > > > > -Original Message- > > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r- > > project.org] On Behalf Of John Sorkin > > Sent: Monday, May 11, 2009 12:19 PM > > To: r-h...@stat.math.ethz.ch > > Subject: [R] Warning trying to plot -log(log(survival)) > > > > windows xp > > R 2.8.1 > > > > I am trying to plot the -log(log(survival)) to visually test the > > proportional hazards assumption of a Cox regression. The plot, wh
Re: [R] Warning trying to plot -log(log(survival))
Hi John, I can't reproduce your case with built-in data sets. Your data set is small. Can you show the data for the variables involved in your example? (Time30, Died, Rx, Age) Steven McKinney, Ph.D. Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre email: smckin...@bccrc.ca tel: 604-675-8000 x7561 BCCRC Molecular Oncology 675 West 10th Ave, Floor 4 Vancouver B.C. V5Z 1L3 Canada > -Original Message- > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r- > project.org] On Behalf Of John Sorkin > Sent: Monday, May 11, 2009 12:19 PM > To: r-h...@stat.math.ethz.ch > Subject: [R] Warning trying to plot -log(log(survival)) > > windows xp > R 2.8.1 > > I am trying to plot the -log(log(survival)) to visually test the > proportional hazards assumption of a Cox regression. The plot, which > should give two lines (one for each treatment) gives only one line and > a warning message. I would appreciate help getting two lines, and an > explanation of the warning message. My problem may the that I have very > few events in one of my strata, but I don't know. > Thanks, > John > > fit0<-coxph(Surv(Time30,Died)~strata(Rx)+Age,data=GVHDdata) > plot(survfit(fit0),fun="cloglog") > > > WARNING: Warning in xy.coords (x, y, xlabel, ylabel, log) : > 2 x values <=0 omitted from logarithmic plot > > > > > print(survfit(fit0),fun="cloglog") > Call: survfit.coxph(object = fit0) > > n events median 0.95LCL 0.95UCL > Rx=0 31 5Inf Inf Inf > Rx=1 32 15Inf1.17 Inf > > John David Sorkin M.D., Ph.D. > Chief, Biostatistics and Informatics > University of Maryland School of Medicine Division of Gerontology > Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) > Baltimore, MD 21201-1524 > (Phone) 410-605-7119 > (Fax) 410-605-7913 (Please call phone number above prior to faxing) > > Confidentiality Statement: > This email message, including any attachments, is for > th...{{dropped:6}} > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting- > guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Corrupt data frame construction - bug?
Thanks Duncan, Comments and a proposed bug fix in-line below: > -Original Message- > From: Duncan Murdoch [mailto:murd...@stats.uwo.ca] > Sent: Wednesday, April 29, 2009 5:10 PM > To: Steven McKinney > Cc: R-help@r-project.org > Subject: Re: [R] Corrupt data frame construction - bug? > > On 29/04/2009 6:41 PM, Steven McKinney wrote: > > Hi useRs, > > > > A recent coding infelicity along these lines yielded a corrupt data > > frame. > > > > foo <- matrix(1:12, nrow = 3) > > bar <- data.frame(foo) > > bar$NewCol <- foo[foo[, 1] == 4, 4] > > bar > > lapply(bar, length) > > > > > > > > > >> foo <- matrix(1:12, nrow = 3) > >> bar <- data.frame(foo) > >> bar$NewCol <- foo[foo[, 1] == 4, 4] > >> bar > > X1 X2 X3 X4 NewCol > > 1 1 4 7 10 > > 2 2 5 8 11 > > 3 3 6 9 12 > > Warning message: > > In format.data.frame(x, digits = digits, na.encode = FALSE) : > > corrupt data frame: columns will be truncated or padded with NAs > >> lapply(bar, length) > > $X1 > > [1] 3 > > > > $X2 > > [1] 3 > > > > $X3 > > [1] 3 > > > > $X4 > > [1] 3 > > > > $NewCol > > [1] 0 > > > > > > Is this a bug in the data.frame machinery? > > If an attempt is made to add a new column to a data frame, and the > new > > object does not have length = number of rows of data frame, or cannot > > be made to have such length via recycling, shouldn't an error be > > thrown? > > > > Instead in this example I end up with a "corrupt data frame" having > > one zero-length column. > > > > > > Should this be reported as a bug, or did I misinterpret the > > documentation? > > I don't think "$" uses any data.frame machinery. You are working at a > lower level. > > If you had added the new column using > > bar <- data.frame(bar, NewCol=foo[foo[, 1] == 4, 4]) > > you would have seen the error: > > Error in data.frame(bar, NewCol = foo[foo[, 1] == 4, 4]) : >arguments imply differing number of rows: 3, 0 > > But since you treated it as a list, it let you go ahead and create > something that was labelled as a data.frame but wasn't. This is one of > the reasons some people prefer S4 methods: it's easier to protect > against people who mislabel things. > I did some more digging on '$' - there is a data.frame method for it: > getAnywhere("$<-.data.frame" ) A single object matching '$<-.data.frame' was found It was found in the following places package:base registered S3 method for $<- from namespace base namespace:base with value function (x, i, value) { cl <- oldClass(x) class(x) <- NULL nrows <- .row_names_info(x, 2L) if (!is.null(value)) { N <- NROW(value) if (N > nrows) stop(gettextf("replacement has %d rows, data has %d", N, nrows), domain = NA) if (N < nrows && N > 0L) if (nrows%%N == 0L && length(dim(value)) <= 1L) value <- rep(value, length.out = nrows) else stop(gettextf("replacement has %d rows, data has %d", N, nrows), domain = NA) if (is.atomic(value)) names(value) <- NULL } x[[i]] <- value class(x) <- cl return(x) } > I placed a browser() command before return(x) and did some poking around. It seems to me there's a bug in this function. It should be able to detect the problem I threw at it, and throw an error as you point out is thrown by the other data.frame assign method. I modified the rows if (N < nrows && N > 0L) if (nrows%%N == 0L && length(dim(value)) <= 1L) to read if (N < nrows) if (N > 0L && nrows%%N == 0L && length(dim(value)) <= 1L) as in "$<-.data.frame" <- function (x, i, value) { cl <- oldClass(x) class(x) <- NULL nrows <- .row_names_info(x, 2L) if (!is.null(value)) { N <- NROW(value) if (N > nrows) stop(gettextf("replacement has %d rows, data has %d", N, nrows), domain = NA) if (N < nrows) if (N > 0L && nrows%%N == 0L && length(dim(value)) <= 1L) value <- rep(value, length.out = nrows) else stop(gettextf("replacement has %d rows, data has %d", N, nrows), domain = NA) if (is.atomic(value)) names(value
[R] Corrupt data frame construction - bug?
Hi useRs, A recent coding infelicity along these lines yielded a corrupt data frame. foo <- matrix(1:12, nrow = 3) bar <- data.frame(foo) bar$NewCol <- foo[foo[, 1] == 4, 4] bar lapply(bar, length) > foo <- matrix(1:12, nrow = 3) > bar <- data.frame(foo) > bar$NewCol <- foo[foo[, 1] == 4, 4] > bar X1 X2 X3 X4 NewCol 1 1 4 7 10 2 2 5 8 11 3 3 6 9 12 Warning message: In format.data.frame(x, digits = digits, na.encode = FALSE) : corrupt data frame: columns will be truncated or padded with NAs > lapply(bar, length) $X1 [1] 3 $X2 [1] 3 $X3 [1] 3 $X4 [1] 3 $NewCol [1] 0 Is this a bug in the data.frame machinery? If an attempt is made to add a new column to a data frame, and the new object does not have length = number of rows of data frame, or cannot be made to have such length via recycling, shouldn't an error be thrown? Instead in this example I end up with a "corrupt data frame" having one zero-length column. Should this be reported as a bug, or did I misinterpret the documentation? > sessionInfo() R version 2.9.0 (2009-04-17) powerpc-apple-darwin8.11.1 locale: en_CA.UTF-8/en_CA.UTF-8/C/C/en_CA.UTF-8/en_CA.UTF-8 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] nlme_3.1-90 loaded via a namespace (and not attached): [1] grid_2.9.0 lattice_0.17-22 tools_2.9.0 > Also occurs on Windows box with R 2.8.1 Steven McKinney Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre email: smckinney +at+ bccrc +dot+ ca tel: 604-675-8000 x7561 BCCRC Molecular Oncology 675 West 10th Ave, Floor 4 Vancouver B.C. V5Z 1L3 Canada __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] create objects in a loop and adding sqlQuery content to them
> > > > > #Here my first try [error message on the line within the loop, saying > > something like: > > # 'recursive indexing on level 2 failed'] > > sites_object_list <- vector("list",99) > > If there's a site number larger than 99 this will be problematic. Sorry, this isn't correct - R will just increase the list size if a site number is bigger than 99. Steve McKinney __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] create objects in a loop and adding sqlQuery content to them
Hello > -Original Message- > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] > On Behalf Of Katharina May > Sent: Monday, April 20, 2009 5:05 PM > To: r-help@r-project.org > Subject: [R] create objects in a loop and adding sqlQuery content to them > > Hi there, > > I've got a database or rather spreadsheet with several columns and rows. > For one column named sites I want to loop through all possible values > and retrieve > all data out of the database where site = x and write it into an > object named 'sitex_data'. > > Somehow I'm really missing something as I'm not able to create these > sitex_data objects with > the database values, neither using list nor assign... > > Here some code snipplets: > >library (RODBC) > >channel <- odbcConnectExcel2007 ("biomass_data.xlsx") > >site_data <- sqlQuery(channel, "select site_no from [biomass_data > $] group by site_no") > > #Here the values I want to loop through > >str(site_data) > 'data.frame': 44 obs. of 1 variable: > $ site_no: num 4 7 9 10 15 16 17 18 19 20 ... There's only 44 of them - can you show the whole thing? e.g. >site_data Also check the class of site_data$site_no It might be of class character or factor and not just numeric as it might initially appear. > > #Here my first try [error message on the line within the loop, saying > something like: > # 'recursive indexing on level 2 failed'] > sites_object_list <- vector("list",99) If there's a site number larger than 99 this will be problematic. > for(i in site_data) { > sites_object_list[[i]] <- sqlQuery(channel, paste("select * from > [biomass_data$] where site_no = ",i,sep="")) > } > > #Here my second try [error message on the line within the loop, > saying something like: > # 'only the first element will be used as variable name'] > for(i in site_data) { > assign(paste("site",i,"_data",sep=""), sqlQuery(channel, > paste("select * from [biomass_data$] where site_no = ",i,sep=""))) > > } I'm guessing one of the site numbers is not just a simple integer but has two numbers with a space or something in it. > > I would be very, very glad if anybody sends me a clue about this, as > I'm a obviuos Newbie using > R... > > Thanks, > > Katharina > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting- > guide.html > and provide commented, minimal, self-contained, reproducible code. Steven McKinney, Ph.D. Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre email: smckin...@bccrc.ca tel: 604-675-8000 x7561 BCCRC Molecular Oncology 675 West 10th Ave, Floor 4 Vancouver B.C. V5Z 1L3 Canada __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] what is R equivalent of Fortran DOUBLE PRECISION ?
>From the help page for "Foreign" > ?Foreign Numeric vectors in R will be passed as type double * to C (and as double precision to Fortran) Also, see the help page for "double" > ?double R has no single precision data type. All real numbers are stored in double precision format. The functions as.single and single are identical to as.double and double except they set the attribute Csingle that is used in the .C and .Fortran interface, and they are intended only to be used in that context. HTH Steven McKinney, Ph.D. Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre email: smckinney +at+ bccrc +dot+ ca tel: 604-675-8000 x7561 BCCRC Molecular Oncology 675 West 10th Ave, Floor 4 Vancouver B.C. V5Z 1L3 Canada -Original Message- From: r-help-boun...@r-project.org on behalf of mau...@alice.it Sent: Mon 3/30/2009 3:07 AM To: r-help@r-project.org Cc: John C. Nash Subject: [R] what is R equivalent of Fortran DOUBLE PRECISION ? I noticed taht R cannot understand certain Fortran real constant formats. For instance: c14<- as.double( 7.785205408500864D-02) Error: unexpected symbol in " c14<- as.double( 7.785205408500864D" The above "D" is used in Fortran language to indicate the memory starage mode. That is for instructing Fortran compiler to store such a REAL constant in DOUBLE PRECISION... am I right ? Since R cannot undestand numerical conatant post-fixed by the letter "D", I wonder how I can instruct R interpreter to store such a numerical constant reserving as muh memory as necessary so as to accommodate a double precision number. I noticed R accepts the folllowing syntax but I do not know if i have achieved my goal thsi way: > c14<- as.double( 7.785205408500864E-02) > typeof(c4) [1] "double" My questions are: what is the best precision I can get with R when dealing with real number ? Is R "double" type equvalent to Fortran DOUBLE PRECISION for internal number representation ? Thank you very much. Maura tutti i telefonini TIM! [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Combining columns from two dataframes
Not sure how well this gets at what you want, but it seems close (if not kludgy!) > c <- data.frame(t(cbind(t(a), t(b))), stringsAsFactors = FALSE)[, 1, drop = > FALSE] > c a 1 2008-07-27 2 2008-10-01 3 2008-08-15 4 2008-08-14 5 2008-08-14 6 2008-09-20 7 2008-07-27 8 2008-10-01 > c$a <- as.Date(c$a) > c a 1 2008-07-27 2 2008-10-01 3 2008-08-15 4 2008-08-14 5 2008-08-14 6 2008-09-20 7 2008-07-27 8 2008-10-01 > lapply(c, class) $a [1] "Date" > Steven McKinney, Ph.D. Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre email: smckinney +at+ bccrc +dot+ ca tel: 604-675-8000 x7561 BCCRC Molecular Oncology 675 West 10th Ave, Floor 4 Vancouver B.C. V5Z 1L3 Canada -Original Message- From: r-help-boun...@r-project.org on behalf of Andrew McFadden Sent: Tue 3/17/2009 2:55 PM To: r-help@r-project.org Subject: [R] Combining columns from two dataframes I all I am trying to combine columns from two dataframes to make a completely new dataframe consisting of one column of dates (ie a combination of dates from the two dataframes). >From the following dataframes a b 1 2008-07-27 1 2 2008-10-01 2 3 2008-08-15 3 4 2008-08-14 4 5 2008-08-14 5 6 2008-09-20 6 c d 1 2008-07-27 1 2 2008-10-01 2 I would like to get: z 2008-07-27 2008-10-01 2008-08-15 2008-08-14 2008-08-14 2008-09-20 2008-07-27 2008-10-01 a<- data.frame(as.Date(c("2008-07-27","2008-10-01","2008-08-15","2008-08-14" , "2008-08-14","2008-09-20"), format = "%Y-%m-%d"),c(1:6)) names(a)=c("a","b") a b<- data.frame(as.Date(c("2008-07-27","2008-10-01"), format = "%Y-%m-%d"),c(1:2)) names(b)=c("c","d") a;b I have tried: append(a,b) combine(a,b) rbind(a,b) I would appreciate your help. Kind regards andy Andrew McFadden MVS BVSc Incursion Investigator Investigation & Diagnostic Centres - Wallaceville Biosecurity New Zealand Ministry of Agriculture and Forestry Phone 04 894 5600 Fax 04 894 4973 Mobile 029 894 5611 Postal address: Investigation and Diagnostic Centre- Wallaceville Box 40742 Ward St Upper Hutt This email message and any attachment(s) is intended solely for the addressee(s) named above. The information it contains is confidential and may be legally privileged. Unauthorised use of the message, or the information it contains, may be unlawful. If you have received this message by mistake please call the sender immediately on 64 4 8940100 or notify us by return email and erase the original message and attachments. Thank you. The Ministry of Agriculture and Forestry accepts no responsibility for changes made to this email or to any attachments after transmission from the office. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How do I set the Windows temporary directory in R?
Check help for tempfile() and tempdir() You can set an environment variable as discussed in the help to point to D:\temp >From ?tempfile: By default, the filenames will be in the directory given by tempdir(). This will be a subdirectory of the temporary directory found by the following rule. The environment variables TMPDIR, TMP and TEMP are checked in turn and the first found which points to a writable directory is used: if none succeeds /tmp is used. HTH Steven McKinney, Ph.D. Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre email: smckinney +at+ bccrc +dot+ ca tel: 604-675-8000 x7561 BCCRC Molecular Oncology 675 West 10th Ave, Floor 4 Vancouver B.C. V5Z 1L3 Canada -Original Message- From: r-help-boun...@r-project.org on behalf of Jonathan Greenberg Sent: Tue 3/17/2009 6:42 PM To: r-help@r-project.org Subject: [R] How do I set the Windows temporary directory in R? I'm trying to redirect where temporary files go under R to D:\temp\somerandomname from its default (C:\Documents and Settings\username\Temp\somerandomname) -- how do I go about doing this? --j -- Jonathan A. Greenberg, PhD Postdoctoral Scholar Center for Spatial Technologies and Remote Sensing (CSTARS) University of California, Davis One Shields Avenue The Barn, Room 250N Davis, CA 95616 Cell: 415-794-5043 AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Is it possible for R to import a SigmaPlot file?
Hi Jason Have you checked out Stat/Transfer? www.stattransfer.com They state they handle SYSTAT files, and SigmaPlot is a SYSTAT product. Stat/Transfer has a demo download you could use to test. I've used it to good effect (though for other than SYSTAT datasets) in the past. HTH Steven McKinney Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre email: smckinney +at+ bccrc +dot+ ca tel: 604-675-8000 x7561 BCCRC Molecular Oncology 675 West 10th Ave, Floor 4 Vancouver B.C. V5Z 1L3 Canada -Original Message- From: r-help-boun...@r-project.org on behalf of Jason Rupert Sent: Thu 1/22/2009 2:58 PM To: r-help@r-project.org Subject: [R] Is it possible for R to import a SigmaPlot file? I recently received a Sigmaplot file (*.jnb) from a customer and would like to know if I can input it to a data frame and then manipulate the data in R. I did a search on Google and on RSeek (www.rseek.org), but did not get any good hits. Thank for any feedback and insight you can provide. P.S. Love the flexibility of R and would love to keep using it. Just wanting to know if this is possible. Thanks again. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem with subset() function?
D'oh! My apologies for the noise. I thought I had verified class from the str() output the user was showing me. > class(subset(mydf, ht >= 150.0 & wt <= 150.0, select = c(age))) [1] "data.frame" > class(subset(mydf, ht >= 150.0 & wt <= 150.0, select = c(age), drop = TRUE)) [1] "integer" > class(mydf[mydf$ht >= 150.0 & mydf$wt <= 150.0, "age"]) [1] "integer" > density(subset(mydf, ht >= 150.0 & wt <= 150.0, select = c(age), drop = TRUE)) Call: density.default(x = subset(mydf, ht >= 150 & wt <= 150, select = c(age), drop = TRUE)) Data: subset(mydf, ht >= 150 & wt <= 150, select = c(age), drop = TRUE) (29 obs.); Bandwidth 'bw' = 5.816 xy Min. : 4.553 Min. :3.781e-05 1st Qu.:22.776 1st Qu.:3.108e-03 Median :41.000 Median :1.775e-02 Mean :41.000 Mean :1.370e-02 3rd Qu.:59.224 3rd Qu.:2.128e-02 Max. :77.447 Max. :2.665e-02 > It's the "drop" arg that differs between density(subset(mydf, ht >= 150.0 & wt <= 150.0, select = c(age))) and density(mydf[mydf$ht >= 150.0 & mydf$wt <= 150.0, "age"]) so it is subset(mydf, ht >= 150.0 & wt <= 150.0, select = c(age), drop = TRUE) that is equivalent to mydf[mydf$ht >= 150.0 & mydf$wt <= 150.0, "age"] Apologies and thanks for setting me straight. Best Steven McKinney Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre email: smckinney +at+ bccrc +dot+ ca tel: 604-675-8000 x7561 BCCRC Molecular Oncology 675 West 10th Ave, Floor 4 Vancouver B.C. V5Z 1L3 Canada -Original Message----- From: Marc Schwartz [mailto:marc_schwa...@comcast.net] Sent: Tue 1/20/2009 3:20 PM To: Steven McKinney Cc: R-help@r-project.org Subject: Re: [R] Problem with subset() function? on 01/20/2009 05:02 PM Steven McKinney wrote: > Hi all, > > Can anyone explain why the following use of > the subset() function produces a different > outcome than the use of the "[" extractor? > > The subset() function as used in > > density(subset(mydf, ht >= 150.0 & wt <= 150.0, select = c(age))) Here you are asking density to be run on a data frame, which is what subset returns, even when you select a single column. Thus, you get an error since density() expects a numeric vector. No bug in either subset() or the documentation. You could do this: density(subset(mydf, ht >= 150.0 & wt <= 150.0, select = age)[[1]]) > appears to me from documentation to be equivalent to > > density(mydf[mydf$ht >= 150.0 & mydf$wt <= 150.0, "age"]) Here you are running density on a vector, so it works. This is because the default behavior for "[.data.frame" has 'drop = TRUE', which means that the returned result is coerced to the lowest possible dimension. Thus, rather than a single data frame column, a vector is returned. The result from subset() would be equivalent to using 'drop = FALSE'. HTH, Marc Schwartz > (modulo exclusion of NAs) but use of the former yields an > error from density.default() (shown below). > > > Is this a bug in the subset() machinery? Or is it > a documentation issue for the subset() function > documentation or density() documentation? > > I'm seeing issues such as this with newcomers to R > who initially seem to prefer using subset() instead > of the bracket extractor. At this point these functions > are clearly not exchangeable. Should code be patched > so that they are, or documentation amended to show > when use of subset() is not appropriate? > >> ### Bug in subset()? > >> set.seed(123) >> mydf <- data.frame(ht = 150 + 10 * rnorm(100), > +wt = 150 + 10 * rnorm(100), > +age = sample(20:60, size = 100, replace = TRUE) > +) > > >> density(subset(mydf, ht >= 150.0 & wt <= 150.0, select = c(age))) > Error in density.default(subset(mydf, ht >= 150 & wt <= 150, select = > c(age))) : > argument 'x' must be numeric > > >> density(mydf[mydf$ht >= 150.0 & mydf$wt <= 150.0, "age"]) > > Call: > density.default(x = mydf[mydf$ht >= 150 & mydf$wt <= 150, "age"]) > > Data: mydf[mydf$ht >= 150 & mydf$wt <= 150, "age"] (29 obs.); Bandwidth 'bw' > = 5.816 > >xy > Min. : 4.553 Min. :3.781e-05 > 1st Qu.:22.776 1st Qu.:3.108e-03 > Median :41.000 Median :1.775e-02 > Mean :41.000 Mean :1.370e-02 > 3rd Qu.:59.224 3rd Qu.:2.128e-02 > Max. :77.447 Max. :2.665e-02 > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Problem with subset() function?
Hi all, Can anyone explain why the following use of the subset() function produces a different outcome than the use of the "[" extractor? The subset() function as used in density(subset(mydf, ht >= 150.0 & wt <= 150.0, select = c(age))) appears to me from documentation to be equivalent to density(mydf[mydf$ht >= 150.0 & mydf$wt <= 150.0, "age"]) (modulo exclusion of NAs) but use of the former yields an error from density.default() (shown below). Is this a bug in the subset() machinery? Or is it a documentation issue for the subset() function documentation or density() documentation? I'm seeing issues such as this with newcomers to R who initially seem to prefer using subset() instead of the bracket extractor. At this point these functions are clearly not exchangeable. Should code be patched so that they are, or documentation amended to show when use of subset() is not appropriate? > ### Bug in subset()? > set.seed(123) > mydf <- data.frame(ht = 150 + 10 * rnorm(100), +wt = 150 + 10 * rnorm(100), +age = sample(20:60, size = 100, replace = TRUE) +) > density(subset(mydf, ht >= 150.0 & wt <= 150.0, select = c(age))) Error in density.default(subset(mydf, ht >= 150 & wt <= 150, select = c(age))) : argument 'x' must be numeric > density(mydf[mydf$ht >= 150.0 & mydf$wt <= 150.0, "age"]) Call: density.default(x = mydf[mydf$ht >= 150 & mydf$wt <= 150, "age"]) Data: mydf[mydf$ht >= 150 & mydf$wt <= 150, "age"] (29 obs.); Bandwidth 'bw' = 5.816 xy Min. : 4.553 Min. :3.781e-05 1st Qu.:22.776 1st Qu.:3.108e-03 Median :41.000 Median :1.775e-02 Mean :41.000 Mean :1.370e-02 3rd Qu.:59.224 3rd Qu.:2.128e-02 Max. :77.447 Max. :2.665e-02 > sessionInfo() R version 2.8.0 Patched (2008-11-06 r46845) powerpc-apple-darwin9.5.0 locale: C attached base packages: [1] stats graphics grDevices datasets utils methods base loaded via a namespace (and not attached): [1] Matrix_0.999375-16 grid_2.8.0 lattice_0.17-15lme4_0.99875-9 [5] nlme_3.1-89 > Steven McKinney Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre email: smckinney +at+ bccrc +dot+ ca tel: 604-675-8000 x7561 BCCRC Molecular Oncology 675 West 10th Ave, Floor 4 Vancouver B.C. V5Z 1L3 Canada __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Compare matrices
Use the is.na() function to assign NA values: > is.na(A) <- !B > A [,1] [,2] [,3] [1,]3 NA NA [2,]333 [3,]33 NA > C <- matrix(c(3,3,3,NA,3,3,NA,3,NA),3,3) > all.equal(A, C) [1] TRUE Steven McKinney Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre email: smckinney +at+ bccrc +dot+ ca tel: 604-675-8000 x7561 BCCRC Molecular Oncology 675 West 10th Ave, Floor 4 Vancouver B.C. V5Z 1L3 Canada -Original Message- From: r-help-boun...@r-project.org on behalf of Dimitris Rizopoulos Sent: Mon 1/19/2009 12:54 PM To: Andrej Kastrin Cc: r-help@r-project.org Subject: Re: [R] Compare matrices try this: A <- matrix(c(3,3,3,3,3,3,3,3,3),3,3) B <- matrix(c(T,T,T,F,T,T,F,T,F),3,3) C <- A C[!B] <- NA C I hope it helps. Best, Dimitris Andrej Kastrin wrote: > Dear all, > > Suppose that I have a matrix A > >A <- matrix(c(3,3,3,3,3,3,3,3,3),3,3) > > and a logical matrix B > >B <- matrix(c(T,T,T,F,T,T,F,T,F),3,3) > > The result matrix should be > >C <- matrix(c(3,3,3,NA,3,3,NA,3,NA),3,3) > > Is there any simple tip or trick to perform this without looping? > > Thanks in advance for any suggestion. > > Best regards, Andrej > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Dimitris Rizopoulos Assistant Professor Department of Biostatistics Erasmus Medical Center Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands Tel: +31/(0)10/7043478 Fax: +31/(0)10/7043014 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] coercing a list into matrix
Hi Ken, If your lists are not too large something such as this can work: > matrixFromList <- function(listX) t(sapply(listX, function(x, n) c(x, rep(NA, > n))[1:n], n = max(sapply(listX, length > matrixFromList(list(c(3,2,3),c(6,5))) [,1] [,2] [,3] [1,]323 [2,]65 NA > A <- list(c(3,2,3), c(1,2,3,4), c(5,6)) > matrixFromList(A) [,1] [,2] [,3] [,4] [1,]323 NA [2,]1234 [3,]56 NA NA > class(matrixFromList(A)) [1] "matrix" Best Steven McKinney Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre email: smckinney +at+ bccrc +dot+ ca tel: 604-675-8000 x7561 BCCRC Molecular Oncology 675 West 10th Ave, Floor 4 Vancouver B.C. V5Z 1L3 Canada -Original Message- From: r-help-boun...@r-project.org on behalf of Lo, Ken Sent: Wed 1/14/2009 1:50 PM To: r-help@r-project.org Subject: [R] coercing a list into matrix Dear list, I have a list of number sequences. Each number sequence has different numbers of elements. Is there a quick way (other than to iterate through the entire list) way to coerce list to matrix with NAs filling in the short sequences? An example of what I mean is this: A <- list(c(3,2,3),c(6,5)) I'd like to get A so that it is 3 2 3 6 5 NA Best, Ken __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Memory Efficiency of Symmetric Matrix
See also the dist() function documentation. If you use indexing as described in ?dist it is straightforward to maintain and use a vector of the distances. Steven McKinney Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre email: smckinney +at+ bccrc +dot+ ca tel: 604-675-8000 x7561 BCCRC Molecular Oncology 675 West 10th Ave, Floor 4 Vancouver B.C. V5Z 1L3 Canada -Original Message- From: r-help-boun...@r-project.org on behalf of Søren Højsgaard Sent: Tue 1/6/2009 4:36 PM To: Nathan S. Watson-Haigh; r-help@r-project.org Subject: Re: [R] Memory Efficiency of Symmetric Matrix You can do mat[lower.tri(mat, diag=F)] Søren Fra: r-help-boun...@r-project.org på vegne af Nathan S. Watson-Haigh Sendt: on 07-01-2009 01:28 Til: r-help@r-project.org Emne: [R] Memory Efficiency of Symmetric Matrix -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 I'm generating a symmetric correlation matrix using a data matrix as input: mat <- cor(data.mat) My question is: Is there a more memory efficient way to store this data? For instance, since: all(mat == t(mat)) every value is duplicated, and I should be able to almost half the memory usage for large matrices. Any thoughts/comments? Cheers, Nathan - -- - Dr. Nathan S. Watson-Haigh OCE Post Doctoral Fellow CSIRO Livestock Industries Queensland Bioscience Precinct St Lucia, QLD 4067 Australia Tel: +61 (0)7 3214 2922 Fax: +61 (0)7 3214 2900 Web: http://www.csiro.au/people/Nathan.Watson-Haigh.html - -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.9 (MingW32) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org <http://enigmail.mozdev.org/> iEYEARECAAYFAklj9yAACgkQ9gTv6QYzVL6MGQCg1CHsRGAwEMah/8ZuZ9QFI6O5 lcIAnjZ68DE9FABLMd07A3AfdMPRpXIH =5bet -END PGP SIGNATURE- __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problems installing lme4 on Ubuntu
This may just be version incompatibilities. A similar discussion recently transpired on r-sig-mixed-models, see e.g. https://stat.ethz.ch/pipermail/r-sig-mixed-models/2008q4/001526.html HTH Steven McKinney Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre email: smckinney +at+ bccrc +dot+ ca tel: 604-675-8000 x7561 BCCRC Molecular Oncology 675 West 10th Ave, Floor 4 Vancouver B.C. V5Z 1L3 Canada -Original Message- From: r-help-boun...@r-project.org on behalf of Bill Harris Sent: Fri 12/19/2008 5:13 PM To: r-help@r-project.org Subject: [R] Problems installing lme4 on Ubuntu -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 While I'm not an R expert, I have used R on Windows XP. Now I've moved to Ubuntu (Intrepid), and I'm trying to configure R to work with the Gelman and Hill _Data Analysis Using Regression and Multilevel/Hierarchical Models_. So far, it's not working. I start by following the instructions for installing arm and BRugs at <http://www.stat.columbia.edu/~gelman/bugsR/>. At first, it kept failing, but I kept installing more Ubuntu packages. Finally, I installed the science-statistics package, and things seemed to work: I got the required packages to show up as loaded when I run library(). When I run library(arm) (FWIW, I'm using ESS in Emacs), I get , | > library(arm) | Loading required package: MASS | Loading required package: Matrix | Loading required package: lattice | | Attaching package: 'Matrix' | | | The following object(s) are masked from package:stats : | |xtabs | | Loading required package: lme4 | Error in dyn.load(file, DLLpath = DLLpath, ...) : | function 'cholmod_start' not provided by package 'Matrix' | Error: package 'lme4' could not be loaded ` even though , | > library() | Packages in library '/home/bill/R/i486-pc-linux-gnu-library/2.7': | | arm Data Analysis Using Regression and | Multilevel/Hierarchical Models | codaOutput analysis and diagnostics for MCMC | Matrix Sparse and Dense Matrix Classes and Methods | R2WinBUGS Running WinBUGS and OpenBUGS from R / S-PLUS | | Packages in library '/usr/lib/R/site-library': | . | . | latticeExtraExtra Graphical Utilities Based on Lattice | lme4Linear mixed-effects models using S4 classes | lmtest Testing Linear Regression Models | . | . | | Packages in library '/usr/lib/R/library': | | . | . | . | | Warning message: | In library() : library '/usr/local/lib/R/site-library' contains no packages ` and , | > library(help = arm) | | Information on package 'arm' | | Description: | | Package: arm | Version: 1.1-17 | Date: 2008-11-25 | Title: Data Analysis Using Regression and |Multilevel/Hierarchical Models | Author:Andrew Gelman , Yu-Sung Su |, Masanao Yajima |, Jennifer Hill |, Maria Grazia Pittau |, Jouni Kerman | and Tian Zheng | | Maintainer:Yu-Sung Su | Depends: methods, R (>= 2.6.0), MASS, Matrix (>= 0.999375-10), |lme4 (>= 0.999375-16), R2WinBUGS | Suggests: car, foreign | Description: R functions for processing lm, glm, mer and polr |outputs. | URL: http://www.stat.columbia.edu/~gelman/software/ | License: GPL (>= 2) | Packaged: Tue Nov 25 15:26:57 2008; SUYS | Built: R 2.7.1; ; 2008-12-17 21:31:22; unix | | Index: | | GO-classFunction to Recall Last Source File | balance Functions to compute the balance statistics | bayesglmBayesian generalized linear models. | bayespolr Bayesian Ordered Logistic or Probit Regression | binnedplot Binned Residual Plot | coefplotGeneric Function for Making Coefficient Plot | contr.bayes.ordered Contrast Matrices | corrplotCorrelation Plot | display Functions for Processing lm, glm, mer and polr | Output | fround Formating the Rounding of Numbers | invlogitInverse logistic function | lalonde Lalonde Dataset | matchingSingle Nearest Neighborhood Matching | model.matrix.bayes Construct Design Matrices | multicomp.plot Multiple Comparison Plot | rescale Function for Standardizing by Centering and | Dividing by 2 sd's | residual.plot residual plot for the observed values | se.coef Extract Standard Errors of Model Coefficients | sigma.
Re: [R] How can I draw bars
Hi Raymond, You might be able to get the plots you want with the 'cytoband' material provided in the Bioconductor package "SNPchip". Check out the SNPchip data set 'cytoband' and the plot function "plotCytoband" for ideas. This material is used e.g. in Bioconductor package "VanillaICE" to add a chromosome graphic to the bottom of data plots of genomic data. You can see examples in the vignette PDFs. HTH Steven McKinney Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre email: [EMAIL PROTECTED] tel: 604-675-8000 x7561 BCCRC Molecular Oncology 675 West 10th Ave, Floor 4 Vancouver B.C. V5Z 1L3 Canada > -Original Message- > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] > On Behalf Of stephen sefick > Sent: Tuesday, December 09, 2008 7:48 PM > To: Raymond Balise > Cc: r-help@r-project.org > Subject: Re: [R] How can I draw bars > > arrows and I don't remember what package, but I will see if I can > remember. > > On Tue, Dec 9, 2008 at 3:02 PM, Raymond Balise <[EMAIL PROTECTED]> > wrote: > > I need to make a graphic to show problems on different parts of > > chromosomes (think of a graphic showing the number of frayed threads > > as colors along different parts of a worn out rope). I want to draw > > bars going from left to right across a page and color different parts > > of the bars in different shades. Each graphic will need to have > > several bars of different lenghts corresponding to the different > > lenghts of the chromsomes. > > > > Is there a package around that can help me draw custom bars of > > different colors? I am a hardcore SAS guy who is just learning R so I > > am mostly clueless. > > > > __ > > R-help@r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide http://www.R-project.org/posting- > guide.html > > and provide commented, minimal, self-contained, reproducible code. > > > > > > -- > Stephen Sefick > > Let's not spend our time and resources thinking about things that are > so little or so large that all they really do for us is puff us up and > make us feel like gods. We are mammals, and have not exhausted the > annoying little problems of being mammals. > > -K. Mullis > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting- > guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help with reading code
Hi Dana > -Original Message- > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] > On Behalf Of Dana77 > Sent: Wednesday, December 03, 2008 3:24 PM > To: r-help@r-project.org > Subject: [R] Help with reading code > > > I would like to give out the equation for calculating the maximum > likelihood. > Below is the code, but I still have problems with it. After I read the > code, I found there are two cases for "w(weights)". If "w" is not zero, > then the equation is given as "val <- 0.5 * (sum(log(w)) - N * (log(2 * > pi) > + 1 - log(N) + > log(sum(w * res^2". However, if "w" is zero, then I do not > know > what equation it should be since it does not make any sense for "log0". > Hope > someone can help me to figure this out. Thanks! > > > > > function (object, REML = FALSE, ...) > { > res <- object$residuals > p <- object$rank > N <- length(res) > if (is.null(w <- object$weights)) { > w <- rep.int(1, N) > } > else { > excl <- w == 0 #I can not understand the following lines after this. The command above sets a variable to "exclude" with. Any observation with weight equal to zero will get excluded. excl will have value TRUE for all observations with weight 0. > if (any(excl)) { ### If there are any observations to exclude > res <- res[!excl] ### then keep the ones with weight not zero > N <- length(res) > w <- w[!excl]} so the above chunk keeps only the residuals and weights for the observations with non-zero weights > } > N0 <- N > if (REML) > N <- N - p > val <- 0.5 * (sum(log(w)) - N * (log(2 * pi) + 1 - log(N) + > log(sum(w * res^2 so now there are no more observations with w == 0 in the above equation > if (REML) > val <- val - sum(log(abs(diag(object$qr$qr)[1:p]))) > attr(val, "nall") <- N0 > attr(val, "nobs") <- N > attr(val, "df") <- p + 1 > class(val) <- "logLik" > val > } > > > Best, > > Dana > -- > View this message in context: http://www.nabble.com/Help-with-reading- > code-tp20823979p20823979.html > Sent from the R help mailing list archive at Nabble.com. > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting- > guide.html > and provide commented, minimal, self-contained, reproducible code. Steven McKinney Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre email: [EMAIL PROTECTED] tel: 604-675-8000 x7561 BCCRC Molecular Oncology 675 West 10th Ave, Floor 4 Vancouver B.C. V5Z 1L3 Canada __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] AIC function and Step function
Hi Dana, Many thanks to Christos Hatzis who sent me an offline response, pointing out the new functions that make this much easier than my last suggestions: methods() and getAnywhere() > methods("extractAIC") [1] extractAIC.aov* extractAIC.coxph* extractAIC.glm* extractAIC.lm* extractAIC.negbin* [6] extractAIC.survreg* Non-visible functions are asterisked > getAnywhere("extractAIC.coxph") A single object matching extractAIC.coxph was found It was found in the following places registered S3 method for extractAIC from namespace stats namespace:stats with value function (fit, scale, k = 2, ...) { edf <- length(fit$coef) loglik <- fit$loglik[length(fit$loglik)] c(edf, -2 * loglik + k * edf) } > Thank you Christos. That said, one of the advantages of getting the source code is that it has comments that are stripped out when the code is sourced into R e.g. from the command line > getAnywhere(AIC.default) A single object matching AIC.default was found It was found in the following places registered S3 method for AIC from namespace stats namespace:stats with value function (object, ..., k = 2) { ll <- if ("stats4" %in% loadedNamespaces()) stats4:::logLik else logLik if (length(list(...))) { object <- list(object, ...) val <- lapply(object, ll) val <- as.data.frame(t(sapply(val, function(el) c(attr(el, "df"), AIC(el, k = k) names(val) <- c("df", "AIC") Call <- match.call() Call$k <- NULL row.names(val) <- as.character(Call[-1]) val } else AIC(ll(object), k = k) } >From the source file # File src/library/stats/R/AIC.R # Part of the R package, http://www.R-project.org # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # A copy of the GNU General Public License is available at # http://www.r-project.org/Licenses/ Return the object's value of the Akaike Information Criterion (or "An Inf.. Crit..") AIC <- function(object, ..., k = 2) UseMethod("AIC") ## AIC for logLik objects AIC.logLik <- function(object, ..., k = 2) -2 * c(object) + k * attr(object, "df") AIC.default <- function(object, ..., k = 2) { ## AIC for various fitted objects --- any for which there's a logLik() method: ll <- if("stats4" %in% loadedNamespaces()) stats4:::logLik else logLik if(length(list(...))) {# several objects: produce data.frame object <- list(object, ...) val <- lapply(object, ll) val <- as.data.frame(t(sapply(val, function(el) c(attr(el, "df"), AIC(el, k = k) names(val) <- c("df", "AIC") Call <- match.call() Call$k <- NULL row.names(val) <- as.character(Call[-1]) val } else AIC(ll(object), k = k) } Steven McKinney Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre email: smckinney +at+ bccrc +dot+ ca tel: 604-675-8000 x7561 BCCRC Molecular Oncology 675 West 10th Ave, Floor 4 Vancouver B.C. V5Z 1L3 Canada __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] AIC function and Step function
ues ## for loglik edf <- length(fit$coef) loglik <- fit$loglik[length(fit$loglik)] c(edf, -2 * loglik + k * edf) } extractAIC.survreg <- function(fit, scale, k = 2, ...) { edf <- sum(fit$df) c(edf, -2 * fit$loglik[2] + k * edf) } extractAIC.glm <- function(fit, scale = 0, k = 2, ...) { n <- length(fit$residuals) edf <- n - fit$df.residual aic <- fit$aic c(edf, aic + (k-2) * edf) } extractAIC.lm <- function(fit, scale = 0, k = 2, ...) { n <- length(fit$residuals) edf <- n - fit$df.residual RSS <- deviance.lm(fit) dev <- if(scale > 0) RSS/scale - n else n * log(RSS/n) c(edf, dev + k * edf) } extractAIC.aov <- extractAIC.lm extractAIC.negbin <- function(fit, scale, k = 2, ...) { n <- length(fit$residuals) edf <- n - fit$df.residual c(edf, -fit$twologlik + k * edf) } HTH Steven McKinney Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre email: smckinney +at+ bccrc +dot+ ca tel: 604-675-8000 x7561 BCCRC Molecular Oncology 675 West 10th Ave, Floor 4 Vancouver B.C. V5Z 1L3 Canada -Original Message- From: [EMAIL PROTECTED] on behalf of Dana77 Sent: Thu 11/27/2008 6:11 PM To: r-help@r-project.org Subject: [R] AIC function and Step function I would like to figure out the equations for calculating "AIC" in both "step() function" and "AIC () function". They are different. Then I just type "step" in the R console, and found the "AIC" used in "step() function" is "extractAIC". I went to the R help, and found: "The criterion used is AIC = - 2*log L + k * edf, where L is the likelihood and edf the equivalent degrees of freedom (i.e., the number of free parameters for usual parametric models) of fit. For linear models with unknown scale (i.e., for lm and aov), -2log L is computed from the deviance and uses a different additive constant to logLik and hence AIC. If RSS denotes the (weighted) residual sum of squares then extractAIC uses for - 2log L the formulae RSS/s - n (corresponding to Mallows' Cp) in the case of known scale s and n log (RSS/n) for unknown scale. AIC only handles unknown scale and uses the formula n log (RSS/n) - n + n log 2? - sum log w where w are the weights." Now, my question is what code I should use to look at the exact calculation process in the AIC()function and extractAIC() function in R? Thanks! Dana -- View this message in context: http://www.nabble.com/AIC-function-and-Step-function-tp20728043p20728043.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] 64bit R for Mac
Dear Marco Check out http://r.research.att.com/ and see if the R 2.8.0 Patched (2008-11-21)tiger R-2.8-branch-n.dmg works for you (n = 47002 at this writing - changes every now and then). I'm no longer running Tiger so haven't tried this directly, but the Leopard version is working well. You might also want to follow the r-sig-mac list for mac-specific information. HTH Steven McKinney Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre email: [EMAIL PROTECTED] tel: 604-675-8000 x7561 BCCRC Molecular Oncology 675 West 10th Ave, Floor 4 Vancouver B.C. V5Z 1L3 Canada > -Original Message- > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] > On Behalf Of Blanchette, Marco > Sent: Monday, November 24, 2008 6:34 AM > To: R-help > Subject: [R] 64bit R for Mac > > Dear R gurus, > > On the CRAN website, it says that a 64bit version for Mac OS Tiger would > be release shortly. Do we know what are the expected dates? Will the > packages be also compiled for 64bit? > > We are running large microarray analysis and we keep hitting the 3Gb > memory limit. > > I saw that there is a version available on the development mirrors, but I > am not too excited to replace our very stable and reliable 32bit version > with a 64bit binary that might not be that stable and with packages that > would need to be 64bit compiled on site... > > Cheers > -- > Marco Blanchette, Ph.D. > Assistant Investigator > Stowers Institute for Medical Research > 1000 East 50th St. > > Kansas City, MO 64110 > > Tel: 816-926-4071 > Cell: 816-726-8419 > Fax: 816-926-2018 > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting- > guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] 64bit R for Mac
(Apologies if this repost is a duplicate, my first submission did not appear to make it through.) -Original Message- From: Steven McKinney Sent: Mon 11/24/2008 4:28 PM To: 'Blanchette, Marco'; R-help Subject: RE: [R] 64bit R for Mac Dear Marco Check out http://r.research.att.com/ and see if the R 2.8.0 Patched (2008-11-21)tiger R-2.8-branch-n.dmg works for you (n = 47002 at this writing - changes every now and then). I'm no longer running Tiger so haven't tried this directly, but the Leopard version is working well. You might also want to follow the r-sig-mac list for mac-specific information. HTH Steven McKinney Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre email: [EMAIL PROTECTED] tel: 604-675-8000 x7561 BCCRC Molecular Oncology 675 West 10th Ave, Floor 4 Vancouver B.C. V5Z 1L3 Canada > -Original Message- > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] > On Behalf Of Blanchette, Marco > Sent: Monday, November 24, 2008 6:34 AM > To: R-help > Subject: [R] 64bit R for Mac > > Dear R gurus, > > On the CRAN website, it says that a 64bit version for Mac OS Tiger would > be release shortly. Do we know what are the expected dates? Will the > packages be also compiled for 64bit? > > We are running large microarray analysis and we keep hitting the 3Gb > memory limit. > > I saw that there is a version available on the development mirrors, but I > am not too excited to replace our very stable and reliable 32bit version > with a 64bit binary that might not be that stable and with packages that > would need to be 64bit compiled on site... > > Cheers > -- > Marco Blanchette, Ph.D. > Assistant Investigator > Stowers Institute for Medical Research > 1000 East 50th St. > > Kansas City, MO 64110 > > Tel: 816-926-4071 > Cell: 816-726-8419 > Fax: 816-926-2018 > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting- > guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to test for the empty set
Dear Jay, length(x) will return zero for all these (vector) examples, and is quite readable in code. (You might have seen code such as if(length(x)) { blah } which uses this idea.) Does this cover all your use cases? > x <- c() > class(x) [1] "NULL" > length(x) [1] 0 > y <- letters[1:3] > z <- letters[4:6] > x <- intersect(y,z) > class(x) [1] "character" > length(x) [1] 0 > y <- 1:3 > z <- 4:6 > x <- intersect(y,z) > class(x) [1] "integer" > length(x) [1] 0 > HTH Steven McKinney Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre email: smckinney +at+ bccrc +dot+ ca tel: 604-675-8000 x7561 BCCRC Molecular Oncology 675 West 10th Ave, Floor 4 Vancouver B.C. V5Z 1L3 Canada -Original Message- From: [EMAIL PROTECTED] on behalf of G. Jay Kerns Sent: Mon 11/24/2008 9:41 AM To: r-help@r-project.org Subject: [R] how to test for the empty set Dear R-help, I first thought that the empty set (for a vector) would be NULL. x <- c() x However, the documentation seems to make clear that there _many_ empty sets depending on the vector's mode, namely, numeric(0), character(0), logical(0), etc. This is borne out by y <- letters[1:3] z <- letters[4:6] intersect(y,z) which, of course, is non-NULL: is.null(character(0)) # FALSE So, how can we test if a vector is, say, character(0)? The following doesn't (seem to) work: x <- character(0) x == character(0) # logical(0) More snooping led to the following: wiki.r-project.org/rwiki/doku.php?id=tips:surprises:emptysetfuncs and at the bottom of the page it says "logical(0) is an empty set, thus is TRUE". However, I get isTRUE(logical(0)) # FALSE but, on the other hand, all.equal(x, character(0)) # TRUE This would seem to be the solution, but am I missing something? and in particular, is there an elegant way to check in the case that the mode of the vector is not already known? Thanks in advance for any insight you may have. Best, Jay *** G. Jay Kerns, Ph.D. Associate Professor Department of Mathematics & Statistics Youngstown State University Youngstown, OH 44555-0002 USA Office: 1035 Cushwa Hall Phone: (330) 941-3310 Office (voice mail) -3302 Department -3170 FAX E-mail: [EMAIL PROTECTED] http://www.cc.ysu.edu/~gjkerns/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] lmer p-values for fixed effects missing
> -Original Message- > From: [EMAIL PROTECTED] on behalf of UKaraoz > Sent: Tue 11/18/2008 2:16 PM > To: r-help@r-project.org > Subject: [R] lmer p-values for fixed effects missing > > > I am trying to replicate the repeated measures example from Dr.Faraway's book > (Extending the linear model with R) as follows: > > data(vision) > vision$npower <- rep(1:4,14) > > mmod <-lmer(acuity~power+(1|subject)+(1|subject:eye),vision) > > > When I look at the fixed effects p-value, it is missing. Am I missing > something here? > Thanks. > > Fixed effects: > Estimate Std. Error t value > (Intercept) 112.6429 2.2348 50.40 > power6/18 0.7857 1.54000.51 > power6/36-1. 1.5400 -0.65 > power6/60 3.2857 1.54002.13 > See the discussion titled "lmer, p-values and all that" at https://stat.ethz.ch/pipermail/r-help/2006-May/094765.html and a collection of related discussions at http://wiki.r-project.org/rwiki/doku.php?id=guides:lmer-tests In a nutshell: The complexity of fully crossed mixed effects models makes degrees of freedom and other model attributes more difficult to calculate compared to fixed effects models. Research and debate is ongoing as to the best way to handle model assessment, not all classical summary statistics and associated p-values map to lmer models in a straightforward fashion. Steven McKinney Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre email: smckinney +at+ bccrc +dot+ ca tel: 604-675-8000 x7561 BCCRC Molecular Oncology 675 West 10th Ave, Floor 4 Vancouver B.C. V5Z 1L3 Canada > > -- > View this message in context: > http://www.nabble.com/lmer-p-values-for-fixed-effects-missing-tp20569114p20569114.html > Sent from the R help mailing list archive at Nabble.com. > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] lines.formula with subset
> -Original Message- > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] > On Behalf Of John Field > Sent: Thursday, November 13, 2008 4:41 PM > To: r-help@r-project.org > Subject: [R] lines.formula with subset > > Dear list, > > When I try to use lines.formula with subset and another argument I > get an error. > e.g. > > x<-1:5 > y<-c(1,3,NA,2,5) > plot(y~x, type="n") # set up frame > lines(y~x, subset=!is.na(y)) # works OK > lines(y~x, type="o", col="blue") # works OK > # but > lines(y~x, subset=!is.na(y), col="red") # gives an error: > > Error in if (length(x) == l) x[s] else x : argument is of length zero > > Why does this happen? > It happens because the function graphics:::lines.formula tries to assess the number of rows of data in the data frame containing the variables in the formula y~x (see the line of code l <- nrow(data) in graphics:::lines.formula This is the 'el' in the 'length(x) == l' portion of the line you see in the error message) Because you did not provide the data frame, nrow(data) returns NULL, and thus the if() clause is 'length(x) == NULL' which yields answer logical(0), an invalid answer in an if() clause. Done this way, all is well: mydf <- data.frame(x = 1:5, y = c(1,3,NA,2,5)) plot(y~x, type="n", data = mydf) # set up frame lines(y~x, subset=!is.na(y), data = mydf) # works OK lines(y~x, type="o", col="blue", data = mydf) # works OK lines(y~x, subset=!is.na(y), col="red", data = mydf) # works OK The formula - based functions expect to see a dataframe object from the 'data' arg, but don't enforce this in this case. This may qualify as a logical bug in the graphics:::lines.formula function. An error should have been thrown before the if() clause evaluation, but I'm not sure where in the chain of function calls the check for a valid data object should be done and the error thrown. Otherwise, the data objects y and x that you set up should have been passed downwards in some fashion for evaluation. R-core members who know the rules better than I will have to determine how best to handle this one. HTH Steven McKinney Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre email: [EMAIL PROTECTED] tel: 604-675-8000 x7561 BCCRC Molecular Oncology 675 West 10th Ave, Floor 4 Vancouver B.C. V5Z 1L3 Canada > With thanks, > John > > = > John Field Consulting Pty Ltd > 10 High Street, Burnside SA 5066 > Phone 08 8332 5294 or 0409 097 586 > Fax 08 8332 1229 > Email [EMAIL PROTECTED] > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting- > guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] bizarre axes with xyplot, problem in data?
> If anyone is bored I am still interested in why R > orders lattice plots from bottom to top. The idea was that, just as small values of x are typically found to the left end of the X axis and small values of y are typically found at the bottom end of the Y axis; a trellis plot corresponding to a small value of the conditioning variable should appear to the lower left of the trellis plot set, and a trellis plot corresponding to a large value of the conditioning variable should be found to the upper right of the trellis plot set. Sounds like wise graphical layout policy, but is surprisingly confusing in many cases. Thankfully Deepayan Sarkar has included options to allow alternate layouts. (Reference: Becker, Cleveland, Shyu, Kaluzny http://www.research.att.com/areas/stat/doc/95.12.color.ps Section 2.6 Layout "Remember that trellis displays are filled as graphs, from the origin in the lower left, not top-down as in a table." ) Steven McKinney Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre email: smckinney +at+ bccrc +dot+ ca tel: 604-675-8000 x7561 BCCRC Molecular Oncology 675 West 10th Ave, Floor 4 Vancouver B.C. V5Z 1L3 Canada -Original Message- From: [EMAIL PROTECTED] on behalf of Phillip Porter Sent: Thu 10/30/2008 11:43 AM To: r-help@r-project.org Subject: Re: [R] bizarre axes with xyplot, problem in data? My SAS brain was still plugged in. I had a missing data point entered as a "." I didn't think anything of it since SAS treats that as missing data. Apparently it confuses R. I deleted the "." re-imported the data and everything was beautiful. If anyone is bored I am still interested in why R orders lattice plots from bottom to top. Thanks, Phillip On Thu, Oct 30, 2008 at 1:25 PM, Phillip Porter <[EMAIL PROTECTED]>wrote: > Good Morning, > I am using xyplot to show two variables for each of several subjects as > follows: > xyplot(y~x|as.factor(ID), type="b", layout=c(7,9), > scales=list(x=list(tick.number=3), y=list(tick.number=5))) > > This is almost exactly the code I used for an earlier project, the only > change is the number of ticks, but I'm getting all kinds of craziness on my > Y axis. I played around with everything, and by only using the first 100 or > so subjects I can get everything to work out right, but when I run the whole > dataset my Y axis goes crazy. When I do 1,1 in layout and stretch the > window the full height of my screen it looks like it is starting at 100 and > going up to the top value of my Y data, and then continuing on the Y values > lower than 100, but everything is overlapped so I'm not quite sure. > > Is the problem in my data? (read.csv from an excel file) Is it in the range > of my data? Is it some little detail I'm leaving out of my code? > > Thanks, > Phillip > > P.S. This isn't very important, but I am curious and maybe one of you > knows, why does R start from the bottom and go up when doing the lattice > plots? My first subjects are at the bottom of the page, and move higher as > you move up the page. > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Computational problems in R
I misinterpreted the suggestion from Duncan Murdoch, and I think Amy did too, judging from her comment about B possibly being negative. Duncan Murdoch kindly explained the idea off-list, many thanks to him for taking the time. To avoid further confusion, I've set up a basic example explaining the suggestion. Suppose this is the sum sum(exp(c_i/d)) i = 1, ..., n that needs evaluating: exp(-71) + exp(12) + exp(842) + exp(2100) + exp(141) + exp(-710) The maximum term in the sum is exp(2100) so in the notation of the identity A+B = A*(1 + B/A) = exp(log(A) + log(1 + B/A)) A = exp(2100) B = exp(-71) + exp(12) + exp(842) + exp(141) + exp(-710) so sum(exp(c_i/d)) i = 1, ..., n is exp(2100) * ( 1 + ((exp(-71) + exp(12) + exp(842) + exp(141) + exp(-710))/exp(2100)) ) = exp(2100) * ( 1 + exp(-71 - 2100) + exp(12 - 2100) + exp(842 - 2100) + exp(141 - 2100) + exp(-710 - 2100) ) and of course all the terms exp(-71 - 2100) exp(12 - 2100) exp(842 - 2100) exp(141 - 2100) exp(-710 - 2100) are very close to zero, so the sum sum(exp(c_i/d)) i = 1, ..., n is exp(2100) or 2100 on the log scale. The terms in B are of course never negative as they are all exponential quantities greater than zero, so that's not an issue in this calculation (this 'B' not being the same as the 'B' in Amy's original description). Thanks again to Duncan Murdoch, and my apologies for my misinterpretation. Steven McKinney -Original Message- From: Robin Hankin [mailto:[EMAIL PROTECTED] Sent: Tue 10/28/2008 12:58 AM To: A.Noufaily Cc: Duncan Murdoch; r-help@r-project.org; [EMAIL PROTECTED]; Steven McKinney; Xiaoxu LI Subject: Re: [R] Computational problems in R Hello. The Brobdingnag package uses that identity for a logarithmic representation and also has a hack for negative numbers. HTH rksh A.Noufaily wrote: > > Many thanks for your suggestions... > > I am still checking which one is the most useful for my simulations. > > Concerning using logs, this might be very helpful, but I am not sure if > I can use the following: > A+B = A*(1 + B/A) > = exp(log(A) + log(1 + B/A)) > because unfortunately B can be negative. > However, I might still use logs in case (1 + B/A)>0. > > Regards, > > Amy > > -Original Message- > From: Duncan Murdoch [mailto:[EMAIL PROTECTED] > Sent: Saturday, October 25, 2008 11:36 AM > To: Steven McKinney > Cc: A.Noufaily; r-help@r-project.org > Subject: Re: [R] Computational problems in R > > On 24/10/2008 9:50 PM, Steven McKinney wrote: > >> I suspect there's a deeper issue here. >> sum(exp(yi)) when large yi occur is >> problematic. exp(yi) for yi>710 is >> just a huge number, and summing additional values only makes the >> overall sum larger as all components of the summation are positive. >> There's no way around that. >> > > Sure there is, and you quoted it below. Work on a log scale. The log > of exp(yi) is yi, and it sounds as though the yi values are manageable. > > You might end up knowing that the log of the final answer is 2 and > not be able to evaluate exp(2) in R, but you still know that the > answer is exp(2). > > Duncan Murdoch > >> You could try this with Robin Hankins' >> package "brobdingnag" which can handle bunches of bizarrely large >> numbers. >> >> What kind of process are you studying? >> What kind of process generates values >> such as exp(8/0.01) when other values >> are much smaller? Are these outliers >> in an otherwise well-behaved >> data set? Perhaps then they need >> to be set aside and investigated >> separately, etc. >> >> >> Steven McKinney >> >> Statistician >> Molecular Oncology and Breast Cancer Program British Columbia Cancer >> Research Centre >> >> email: smckinney +at+ bccrc +dot+ ca >> >> tel: 604-675-8000 x7561 >> >> BCCRC >> Molecular Oncology >> 675 West 10th Ave, Floor 4 >> Vancouver B.C. >> V5Z 1L3 >> Canada >> >> >> >> >> -Original Message- >> From: [EMAIL PROTECTED] on behalf of Duncan Murdoch >> Sent: Fri 10/24/2008 4:04 PM >> To: A.Noufaily >> Cc: r-help@r-project.org >> Subject: Re: [R] Computational problems in R >> >> On 24/10/2008 12:42 PM, A.Noufaily wrote: >> >>> Dear all, >>> >>> I would be grateful if anyone can help me with the following: >>> >>> My aim is to compute explicitely the sum S=A+B where >>> A=sum(exp(c_i/d)), i=1,...,n; B, c_i, and d are real numbers with >>> -Inf0. >>> The problem is that when c_i/d >710 (for so
Re: [R] Computational problems in R
I suspect there's a deeper issue here. sum(exp(yi)) when large yi occur is problematic. exp(yi) for yi>710 is just a huge number, and summing additional values only makes the overall sum larger as all components of the summation are positive. There's no way around that. You could try this with Robin Hankins' package "brobdingnag" which can handle bunches of bizarrely large numbers. What kind of process are you studying? What kind of process generates values such as exp(8/0.01) when other values are much smaller? Are these outliers in an otherwise well-behaved data set? Perhaps then they need to be set aside and investigated separately, etc. Steven McKinney Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre email: smckinney +at+ bccrc +dot+ ca tel: 604-675-8000 x7561 BCCRC Molecular Oncology 675 West 10th Ave, Floor 4 Vancouver B.C. V5Z 1L3 Canada -Original Message- From: [EMAIL PROTECTED] on behalf of Duncan Murdoch Sent: Fri 10/24/2008 4:04 PM To: A.Noufaily Cc: r-help@r-project.org Subject: Re: [R] Computational problems in R On 24/10/2008 12:42 PM, A.Noufaily wrote: > Dear all, > > I would be grateful if anyone can help me with the following: > > My aim is to compute explicitely the sum S=A+B where A=sum(exp(c_i/d)), > i=1,...,n; > B, c_i, and d are real numbers with -Inf0. > The problem is that when c_i/d >710 (for some i) R is setting > exp(c_i/d) to be equal to +Inf and hence the whole summation S. > So in simple cases where for example c_i=8 (for some i), and d=0.01 the > whole summation is turning out to be infinite. > Is there a way to get round that in R? > Can anyone suggest any computational trick to calculate S when c_i/d>710 > (for some i)? Work on a log scale. Use the identity that A+B = A*(1 + B/A) = exp(log(A) + log(1 + B/A)) (where you chose A to be the biggest term in the sum). Duncan Murdoch > > Any suggestions would be much appreciated. > > Regards, > > Amy > > > > > > - > The Open University is incorporated by Royal Charter (RC 000391), an exempt > charity in England & Wales and a charity registered in Scotland (SC 038302). > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] combining data from different datasets
If you are using regular R graphs (i.e. not lattice or other library graphics) try setting the margins with the mar argument to par() e.g. par(mar = c(5, 10, 5, 1)) The four numbers specify the amount of margin room on the bottom, left, top, right respectively. Set the left margin value large enough to give your labels enough room. HTH Steven McKinney Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre email: smckinney +at+ bccrc +dot+ ca tel: 604-675-8000 x7561 BCCRC Molecular Oncology 675 West 10th Ave, Floor 4 Vancouver B.C. V5Z 1L3 Canada -Original Message- From: [EMAIL PROTECTED] on behalf of Dr Eberhard W Lisse Sent: Fri 10/24/2008 2:44 PM To: R-help Mailing List Cc: Dr Eberhard W Lisse Subject: Re: [R] combining data from different datasets It's Complicated® :-)-O I pull the data from a postgresql table, but I am getting there, thank you the help. Another question, I am barplotting the continents horizontally, ie the more participants the lrger the bars are. I have managed the make the labels (names of the nontinents) to be horizontally but the longer ones (North and South America) flow off the left side (right justified). So, how do I make the plot smaller, and move it to the right so that the complete names appear? I can make the plot smaller in inches but that doesn't scale if I enlarge the window. Pointers to read will be fine, but so far I haven't found it :-)-O el On 24 Oct 2008, at 18:24 , Gabor Grothendieck wrote: > On Fri, Oct 24, 2008 at 11:37 AM, Dr Eberhard W Lisse <[EMAIL PROTECTED]> > wrote: >> >> This looks very cool. >> >> But I must still make a plan with regards to country = "NA" (Namibia) >> or continent = "NA" (North America) >> >> But there are the vignettes. >> >> el >> > > NA and "NA" are not the same: > >> DF <- data.frame(x = c("a", "NA", NA)) >> DF > x > 1a > 2 NA > 3 >> >> is.na(NA) > [1] TRUE >> is.na("NA") > [1] FALSE > -- Dr. Eberhard W. Lisse \/ Obstetrician & Gynaecologist (Saar) [EMAIL PROTECTED] el108-ARIN / * | Telephone: +264 81 124 6733 (cell) PO Box 8421 \ / Please send DNS/NA-NiC related e-mail Bachbrecht, Namibia ;/ to [EMAIL PROTECTED] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] help regarding levels
Your column of data has some character data in it, perhaps an Excel #VALUE! or a blank or some such entry not strictly numeric. When R reads in such a column, it makes that column variable into a 'factor' variable instead of a numeric variable, because the values are not all numeric. You can specify stringsAsFactors = FALSE to force R to leave character data as is if you do not want conversion to factors. If there is Excel cruft in a variable column, it will end up as a factor or character variable. You can try to make it numeric with df$x <- as.numeric(as.character(df$x)) (assuming you read the data into a dataframe called df) and R will replace the Excel cruft with NA values. Alternatively you can clean up the Excel data before importing it. HTH Steve McKinney -Original Message- From: [EMAIL PROTECTED] on behalf of K3 Sent: Fri 10/3/2008 4:09 PM To: r-help@r-project.org Subject: [R] help regarding levels When i try to extract a column of data from an excel file and assign it to a variable , say x, it does assign the column of data as well as different levels. it looks something like this when i print. [1] 6.91 5.89 7.44 8.82 80 Levels: 1.43 102.07 103.65 106.21 106.24 107.15 108.58 11.19 ... so how does this levels come into picture and what do they do. I couldnot run linear regression with x as a predictor just because of this levels. please explain. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] print.data.frame : row.name = FALSE not having intendedeffect
Here's the current method for printing a data frame: > print.data.frame function (x, ..., digits = NULL, quote = FALSE, right = TRUE, row.names = TRUE) { n <- length(row.names(x)) if (length(x) == 0L) { cat("NULL data frame with", n, "rows\n") } else if (n == 0L) { print.default(names(x), quote = FALSE) cat("<0 rows> (or 0-length row.names)\n") } else { m <- as.matrix(format.data.frame(x, digits = digits, na.encode = FALSE)) if (!isTRUE(row.names)) dimnames(m)[[1]] <- if (identical(row.names, FALSE)) rep.int("", n) else row.names print(m, ..., quote = quote, right = right) } invisible(x) } > so you can either set up your own copy of this or make a matrix copy of your data frame (as done in the function above) and set the row names to "" HTH Steve McKinney -Original Message- From: [EMAIL PROTECTED] on behalf of Matthew Pettis Sent: Tue 9/23/2008 6:55 PM To: r-help@r-project.org Subject: Re: [R] print.data.frame : row.name = FALSE not having intendedeffect Never mind -- the answer is buried in my own question... I was looking at documentation for version 2.7.2, and when I looked at the one for 2.6.2, I see the row.names option isn't in that release. Any suggestions on how I can code around that in 2.6.2, so I don't have to upgrade to 2.7.2 just yet? Thanks, Matt On Tue, Sep 23, 2008 at 8:49 PM, Matthew Pettis <[EMAIL PROTECTED]> wrote: > Hi, > > Does anybody know if row.name = FALSE actually works in v2.6.2? > Because it isn't working for me... here is some sample code and > output: > > ++++++++++ >> print(x,row.names = FALSE) > party_abbr candidate_name votes_candidate > 2 DFLAMY KLOBUCHAR 1,278,849 > 5 R MARK KENNEDY 835,653 > 4 IPROBERT FITZGERALD 71,194 > 3 GP MICHAEL JAMES CAVLAN 10,714 > 1 CP BEN POWERS 5,408 > 10 WI WRITE-IN** 901 > 8 WI PETER IDUSOGIE** 29 > 6 WICHARLES ALDRICH** 15 > 9 WI REBECCA WILLIAMSON** 5 > 7 WI JOHN ULDRICH** 4 >> > ++++++++++ > > Thanks, > Matt > > -- > It is from the wellspring of our despair and the places that we are > broken that we come to repair the world. > -- Murray Waas > -- It is from the wellspring of our despair and the places that we are broken that we come to repair the world. -- Murray Waas __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] sort a data matrix by all the values and keep the names
Is something missing in the melt()? > x<-data.frame(x1=c(1,7),x2=c(4,6),x3=c(8,2)) > require("reshape") Loading required package: reshape > dfm <- melt(x, id = c()) Error in if (!missing(id.var) && !(id.var %in% varnames)) { : missing value where TRUE/FALSE needed > dfm[order(dfm$value), ] Error: object "dfm" not found > x x1 x2 x3 1 1 4 8 2 7 6 2 > melt(x, id = c()) Error in if (!missing(id.var) && !(id.var %in% varnames)) { : missing value where TRUE/FALSE needed > Steve McKinney -Original Message- From: [EMAIL PROTECTED] on behalf of hadley wickham Sent: Mon 9/22/2008 5:47 PM To: zhihuali Cc: [EMAIL PROTECTED] Subject: Re: [R] sort a data matrix by all the values and keep the names On Mon, Sep 22, 2008 at 6:54 PM, zhihuali <[EMAIL PROTECTED]> wrote: > > Dear all, > > If I have a data frame x<-data.frame(x1=c(1,7),x2=c(4,6),x3=c(8,2)): > x1 x2 x3 > 1 4 8 > 7 6 2 > > I want to sort the whole data and get this: > x1 1 > x3 2 > x2 4 > x2 6 > x1 7 > x3 8 > > If I do sort(X), R reports: > Error in order(list(x1 = c(1, 7), x2 = c(4, 6), x3 = c(8, 2)), decreasing = > FALSE) : > unimplemented type 'list' in 'orderVector1' > > The only way I can sort all the data is by converting it to a matrix: >> sort(as.matrix(x)) > [1] 1 2 4 6 7 8 > > But now I lost all the names attributes. > > Is it possible to sort a data frame and keep all the names? Here's one way: dfm <- melt(x, id = c()) dfm[order(dfm$value), ] Hadley -- http://had.co.nz/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to keep up with R?
One thing that R does very well that SAS does not is graphics - graphical portrayal of data is important, and you can keep up with R by supplementing your SAS analyses with R graphics. Steve McKinney -Original Message- From: [EMAIL PROTECTED] on behalf of Barry Rowlingson Sent: Fri 9/19/2008 12:36 AM To: Wensui Liu Cc: r-help Subject: Re: [R] how to keep up with R? 2008/9/19 Wensui Liu <[EMAIL PROTECTED]>: > Dear Listers, > > I've been a big fan of R since graduate school. After working in the > industry for years, I haven't had many opportunities to use R and am mainly > using SAS. However, I am still forcing myself really hard to stay close to R > by reading R-help and books and writing R code by myself for fun. But by and > by, I start realizing I have hard time to keep up with R and am afraid that > I would totally forget how to program in R. > > I really like it and am very unwilling to give it up. Is there any idea how > I might keep touch with R without using it in work on daily basis? I really > appreciate it. > How about doing some kind of presentation on R at your work? It's possible that some of the old fossils don't even know about it at all, and use SAS because to them the alternative is SPSS. Do some R evangelization. Find a task that R does better than SAS (not difficult) and illustrate that to your superiors. Then when they ask how much a corporate R license is, you tell them it's free, or say it'll cost them a 2% raise in your salary, or say it will cost them your resignation if you are feeling brave! Sure you may be tied to SAS for some other reasons, but there's no reason why you can't use R for other things. Work out how to get it into your corporate framework. Encourage your colleagues to look at it for their tasks. Enthuse. The good thing about training and evangelization is that at first you don't need mad skillz at R to do it. I have trouble understanding some of the tips on R-help (especially when do.call() is used), but you can teach new people with a good knowledge of the basics, which you should still have. Eventually the hope is that enough people use R at your workplace to develop a community where everyone keeps everyone else on their toes with R questions! Good luck! Barry __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Different PCA results under Windows and Linux
Not likely that anyone can explain, as there is not enough information in your email. Including the contents of the freqtest.txt file was a good idea, as the posting guide suggests (the posting guide is that clearly labeled bit at the bottom that looks like this: PLEASE do read the posting guide http://www.R-project.org/posting-guide.html Check it out! It is cool.) Additionally, include the command sessionInfo() and its output from all machines you refer to so maintainers know which versions of software you are running. Also, include the output you obtained from your code (with your code being a self-contained and reproducible set of R commands). Finally, describe what the difference is and why the difference is problematic (i.e. don't report machine precision differences, or sign differences for PCA results - PCA vector directions are arbitrary modulo 180 degrees). > I also tried mean(xrcc2) and sd(xrcc2) on both machines, the results are the > same. > Please explain. The R maintainers do an amazing job of creating numerically stable platform-independent software, so you get the same results almost everywhere. (Thank you R core!) HTH Steve McKinney -Original Message- From: [EMAIL PROTECTED] on behalf of jathine Sent: Tue 9/16/2008 2:19 PM To: r-help@r-project.org Subject: [R] Different PCA results under Windows and Linux I ran the following R script under both Linux and Windows, and got 2 different results. Linux R version 2.7.1 and Windows R version 2.7.2. > library(FactoMineR) >x1=read.table("freqtest.txt",header=TRUE) >xrcc2=x1[,1:8] >p1=PCA(xrcc2, graph=FALSE) >p1$var freqtest.txt file lines of text : M1 M2 M3 M4 M5 M6 M7 M8 -1 -1 -1 -1 -1 -1 -1 -1 0 0 0 0 -1 -1 1 1 -1 -1 -1 -1 -1 -1 -1 -1 0 0 0 0 -1 -1 1 1 I also tried mean(xrcc2) and sd(xrcc2) on both machines, the results are the same. Please explain. -- View this message in context: http://www.nabble.com/Different-PCA-results-under-Windows-and-Linux-tp19520449p19520449.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Using quasibinomial family in lmer
Hi Russell, You might join in on the discussion on the R-sig-ME mixed effects listserv (see e.g. Doug Bates' message today (Sept 16 2008) titled Re: [R-sig-ME] glmer and overdispersed Poisson models) There Prof. Bates suggests that older versions of lmer() may be doing things more appropriately (though maybe not yet optimally). (He also asks for input on how this might be addressed - I'm not yet quasi-specialized enough to figure it out but maybe someone reading this will be.) Here's what I get with a slightly older version of lmer - note that my quasi-binomial t-value is not as extreme as yours. Note also that for the output below, the ratio of the standard deviation estimates for the intercept estimate > 0.1219/0.07441 [1] 1.638221 equals the residual standard deviation, so it appears that for the quasibinomial case lmer or some lower level routine estimates the scale parameter phi using the residual standard deviation, though I have not delved into the source code to verify this. So some attempt to modify scale appears to be in place for this version of lmer. library(lme4) set.seed(12) eta=rnorm(50) p=exp(eta)/(1+exp(eta)) y=rbinom(50,20,p)/20 #IID overdispersed binomial-normal proportions #y=rbinom(50,20,0.5)/20 #IID binomial(20,0.5) Group=rep(c("A","B","C","D","E"),c(10,10,10,10,10)) lmer(y~1+(1|Group),weights=rep(20,50),family="binomial") lmer(y~1+(1|Group),weights=rep(20,50),family="quasibinomial") sessionInfo() > library(lme4) > set.seed(12) > eta=rnorm(50) > p=exp(eta)/(1+exp(eta)) > y=rbinom(50,20,p)/20 #IID overdispersed binomial-normal proportions > #y=rbinom(50,20,0.5)/20 #IID binomial(20,0.5) > Group=rep(c("A","B","C","D","E"),c(10,10,10,10,10)) > lmer(y~1+(1|Group),weights=rep(20,50),family="binomial") Generalized linear mixed model fit using Laplace Formula: y ~ 1 + (1 | Group) Family: binomial(logit link) AIC BIC logLik deviance 149.7 153.6 -72.87145.7 Random effects: Groups NameVariance Std.Dev. Group (Intercept) 0.0075745 0.087032 number of obs: 50, groups: Group, 5 Estimated scale (compare to 1 ) 1.638542 Fixed effects: Estimate Std. Error z value Pr(>|z|) (Intercept) -0.137390.07441 -1.847 0.0648 . --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 > lmer(y~1+(1|Group),weights=rep(20,50),family="quasibinomial") Generalized linear mixed model fit using Laplace Formula: y ~ 1 + (1 | Group) Family: quasibinomial(logit link) AIC BIC logLik deviance 149.7 153.6 -72.87145.7 Random effects: Groups NameVariance Std.Dev. Group(Intercept) 0.020336 0.14261 Residual 2.684818 1.63854 number of obs: 50, groups: Group, 5 Fixed effects: Estimate Std. Error t value (Intercept) -0.1374 0.1219 -1.127 > sessionInfo() R version 2.7.1 Patched (2008-06-25 r45988) powerpc-apple-darwin8.11.1 locale: C attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] RMySQL_0.6-0 DBI_0.2-4 RODBC_1.2-3 lme4_0.99875-9 [5] Matrix_0.999375-9 lattice_0.17-10 loaded via a namespace (and not attached): [1] grid_2.7.1 > Steven McKinney Statistician Molecular Oncology and Breast Cancer Program British Columbia Cancer Research Centre email: smckinney +at+ bccrc +dot+ ca tel: 604-675-8000 x7561 BCCRC Molecular Oncology 675 West 10th Ave, Floor 4 Vancouver B.C. V5Z 1L3 Canada -Original Message- From: [EMAIL PROTECTED] on behalf of [EMAIL PROTECTED] Sent: Tue 9/16/2008 3:18 PM To: r-help@r-project.org Subject: [R] Using quasibinomial family in lmer Dear R-Users, I can't understand the behaviour of quasibinomial in lmer. It doesn't appear to be calculating a scaling parameter, and looks to be reducing the standard errors of fixed effects estimates when overdispersion is present (and when it is not present also)! A simple demo of what I'm seeing is given below. Comments appreciated? Thanks, Russell Millar Dept of Stat U. Auckland PS. I'm using the latest version of lme4 (0.999375-26) with R 2.7.2. > eta=rnorm(50) > p=exp(eta)/(1+exp(eta)) > y=rbinom(50,20,p)/20 #IID overdispersed binomial-normal proportions > #y=rbinom(50,20,0.5)/20 #IID binomial(20,0.5) > > Group=rep(c("A","B","C","D","E"),c(10,10,10,10,10)) > > #library(lme4) > > lmer(y~1+(1|Group),weights=rep(20,50),family="binomial") Generalized linear mixed model fit by the Laplace approximation Formula: y ~ 1 + (1 | Group) AIC BIC logLik deviance 211 214.8 -103.5 207 Random effects: Groups NameVariance Std.Dev. Group (Intercept) 0.072891 0.26998
Re: [R] bootstrapping - number of items to replace is not a multiple of replacement length
Hello, Your theta() function is returning different sets of coefficients depending on the results of step(). You'll need to add code to theta() to figure out which variables were selected, and store them into the right positions of a vector of length 20 (the apparent number of covariates you describe below), so that your theta() function always returns the same sized output. (Google stepwise regression random forest and you'll get a number of hits about using random forests instead of stepwise regression, and pointers about bootstrapping the random forest.) HTH Steve McKinney > -Original Message- > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] > On Behalf Of Gabriela Bucini > Sent: Tuesday, September 09, 2008 5:02 PM > To: r-help@r-project.org > Subject: [R] bootstrapping - number of items to replace is not a multiple > ofreplacement length > > Hello, > > I'm new to boostrapping and I'd need some help to understand the error > message that pops up when I run my script. > > I have a data.frame with 73 lines and 21 column. > I am running a stepwise regression to find the best model using the R > function "step". > I apply bootstrapping to obtain model coefficients. > This is my script: > > # "datare80" is the name of the data.frame and "woodycover" is the > response > variable > theta <- function(datare80, indices) { > d <- datare80[indices, ]# allows boot to select subsample > datasets > full <- lm(d$woodycover~ ., data= d ) > lmbroadst <- step(full, data=d , direction = "both", k=2, > trace=0) > coefficients(lmbroadst) # return coef. vector > } > resb <- boot(data = datare80, statistic = theta, R=1000) > > > > When I run it, I get these two messages: > If I omit the last line "coefficients(lmbroadst)" in the function > definition, I get : > "Error in t.star[r, ] <- statistic(data, i[r, ], ...) : > incorrect number of subscripts on matrix" > If I have the last line "coefficients(lmbroadst)", then I get: > "Error in t.star[r, ] <- statistic(data, i[r, ], ...) : > number of items to replace is not a multiple of replacement > length" > > > Thank you very much for any help! > > Gabriela > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting- > guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] lattice command equivalent to points
This is close, but maybe not optimal lattice coding. I haven't yet figured out how to suppress the x axis labeling. bwplot(yield ~ 1|year, panel = function(x, y, ...){panel.bwplot(x, y, ..., pch = "|"); panel.points(x, mean(y), ..., pch=17)}, data = barley, horizontal = FALSE, xlab = "") Steve McKinney -Original Message- From: [EMAIL PROTECTED] on behalf of Surai Sent: Tue 9/2/2008 5:53 PM To: r-help@r-project.org Subject: [R] lattice command equivalent to points Hello, This is my first post to R-help so forgive me if I'm committing a fatal error somehow. Feel free to let me know. My question is this: I would like to add a triangle to represent the mean in a box plot generated with the bwplot command. How do I do this? I am able to do this using the boxplot and points command but points does not work with bwplot. If you run the following code in R, you'll see that I'm trying to reproduce graphs from method 2 by modifying code from method 1. Thank you, Surai Jones library(lattice) attach(barley) #method 1 bwplot(yield~year, pch="|") #method 2 boxplot(yield~year) mean.values<-tapply(yield,year, mean) points(1:2, mean.values, pch = 17) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Newbie: Examples on functions callling a library etc.
Hi Ed, Here's a simple example showing your needs: myfun <- function(n1, n2, n3) { mat1 <- matrix(rep(1), nrow = n1, ncol = 3) mat2 <- matrix(rep(2), nrow = n2, ncol = 4) mat3 <- matrix(rep(3), nrow = n3, ncol = 5) require("survival") ## make sure the package you need is loaded mypkgfun <- survival::is.Surv ## Use the '::' and ':::' extractors to get visible and hidden functions respectively from the package list(mat1 = mat1, mat2 = mat2, mat3 = mat3, mypkgfun = mypkgfun) ## Return the items in a list } ## Now invoke the function foo <- myfun(n1 = 1, n2 = 1, n3 = 5) ## and look at the returned results foo > myfun <- function(n1, n2, n3) { + mat1 <- matrix(rep(1), nrow = n1, ncol = 3) + mat2 <- matrix(rep(2), nrow = n2, ncol = 4) + mat3 <- matrix(rep(3), nrow = n3, ncol = 5) + + require("survival") + mypkgfun <- survival::is.Surv + + list(mat1 = mat1, mat2 = mat2, mat3 = mat3, mypkgfun = mypkgfun) + } > > foo <- myfun(n1 = 1, n2 = 1, n3 = 5) > > foo $mat1 [,1] [,2] [,3] [1,]111 $mat2 [,1] [,2] [,3] [,4] [1,]2222 $mat3 [,1] [,2] [,3] [,4] [,5] [1,]33333 [2,]33333 [3,]33333 [4,]33333 [5,]33333 $mypkgfun function (x) inherits(x, "Surv") > HTH Steve McKinney -Original Message- From: [EMAIL PROTECTED] on behalf of Eduardo M. A. M.Mendes Sent: Thu 8/28/2008 5:43 PM To: r-help@r-project.org Subject: [R] Newbie: Examples on functions callling a library etc. Hello R is pretty new to me. I need to write a function that returns three matrices of different dimensions. In addition, I need to call a function from a contributed package with the function. I have browsed several manuals and docs but the examples on them are either very simple or extremely hard to follow. Many thanks Ed [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] conditional with and operators
Did you try it with the vector '&' and operator? d<-sapply(res,function(.df){(.df$TimesVisited[.df$Tick>912 & .df$Id>0])}) (The '&&' operator is designed for use in e.g. if() clauses where you want a scalar logical answer) HTH Steve McKinney -Original Message- From: [EMAIL PROTECTED] on behalf of Altaweel, Mark R. Sent: Tue 8/19/2008 1:10 PM To: r-help@r-project.org Subject: [R] conditional with and operators Hi, I have a problem in which I am parsing data from a list. I am simply trying to return data that has several conditions being true. Here is my syntax below: d<-sapply(res,function(.df){(.df$TimesVisited[.df$Tick>912 && .df$Id>0])}) #res is the list, and I am trying to return a result that has two true conditions (that is the variable Tick should be greate than 912 and Id should be greater than 0 This returns an array of 10 with integer values of 0. This is the incorrect result However, if I do the same syntax except I remove the && statement (i.e. the second conditional), then the result producing something that makes sense, which is all values that are Tick and greater than 912. Can someone let me know how I can setup my data to be parsed so I can have 2 or multiple conditionals in my function that is looking at an array. Thanks in advance. Mark __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Detecting duplicate values
> -Original Message- > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] > On Behalf Of [EMAIL PROTECTED] > Sent: Thursday, August 14, 2008 2:00 PM > To: r-help@r-project.org > Subject: [R] Detecting duplicate values > > This is another "how do I do it" type of question. It seems that for a > function that I have it has a hard time with lists that have a single > repeated value. I want a function (or expression) that will detect this > condition. I want to detect: > > c(2,2,2,2,2) > OR > c(1) > > The following is OK > > c(2,3,3,2) > OR > c(1,2,1) > > These lists are "OK" because they have another value in the list. In other > words I want to detect if there is only one element in the list that > repeats 1 or more times. > > Will length(table(a)) = 1 always work? If so I have answered my own > question. If not please bear with me as I have missed a case. Consider also length(unique(a)) == 1 Steve McKinney > > Thank you. > > Kevin > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting- > guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] missing TRUE/FALSE error in conditional construct
> -Original Message- > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] > On Behalf Of rcoder > Sent: Wednesday, August 13, 2008 3:17 PM > To: r-help@r-project.org > Subject: [R] missing TRUE/FALSE error in conditional construct > > > Hi everyone, > > I posted something similar to this in reply to another post, but there > seems > to be a problem getting it onto the board, so I'm starting a new post. > > I am trying to use conditional formatting to select non-zero and non-NaN > values from a matrix and pass them into another matrix. The problem is > that > I keep encountering an error message indicating the ":missing value where > TRUE/FALSE needed " > > My code is as follows: > > ##Code Start > mat_zeroless<-matrix(NA,5000,2000) #generating holding matrix > > ##Assume source matrix containing combination of values, NaNs and zeros## > for (j in 1:5000) > { > for (k in 1:2000) > { > if(mat[j,k]!=0 & !is.NaN(mat[j,k])) {mat_zeroless[j,k]<-mat[j,k]} > } > } > ##Code End > > Error in if (mat[j,k] !=0 & !is.NaN(mat[j,k])) { :missing value where > TRUE/FALSE needed > > I'm not sure how to resolve this. This seems to do what you appear to need, no loops required (always best whenever possible): > set.seed(123) > mat <- matrix(sample(1:10), nrow = 5) > mat [,1] [,2] [1,]31 [2,]8 10 [3,]49 [4,]72 [5,]65 > is.na(mat) <- c(2, 3) > mat [,1] [,2] [1,]31 [2,] NA 10 [3,] NA9 [4,]72 [5,]65 > mat[5,] <- 0 > mat [,1] [,2] [1,]31 [2,] NA 10 [3,] NA9 [4,]72 [5,]00 > matno0 <- matrix(NA, nrow = nrow(mat), ncol = ncol(mat)) > matno0 [,1] [,2] [1,] NA NA [2,] NA NA [3,] NA NA [4,] NA NA [5,] NA NA > mat[!is.na(mat) & !(mat == 0)] [1] 3 7 1 10 9 2 > matno0[!is.na(mat) & !(mat == 0)] <- mat[!is.na(mat) & !(mat == 0)] > matno0 [,1] [,2] [1,]31 [2,] NA 10 [3,] NA9 [4,]72 [5,] NA NA > Other issues: The error messages you are seeing generally are happening because you are generating non-scalar TRUE / FALSE outcomes or NA outcomes in your if() clause. An if() clause should generate scalar TRUE or FALSE only. To this end, the S language provides the operators '&&' (scalar AND) and '||' (scalar OR) for use in logical scalar clauses. Further, if the first argument to '&&' tests FALSE, the second is not even evaluated, so you could have done if ( !is.na(mat[j, k]) && mat[j, k] != 0 ) { blah } In this case, once the NA is found, the if() clause evaluation is over and you don't have to worry about the NA that will result from the second argument mat[j, k] != 0 > matno0 <- matrix(NA, nrow = nrow(mat), ncol = ncol(mat)) > matno0 [,1] [,2] [1,] NA NA [2,] NA NA [3,] NA NA [4,] NA NA [5,] NA NA > for ( j in 1:5 ) { + for ( k in 1:2 ) { + if ( !is.na(mat[j, k]) && mat[j, k] != 0 ) { matno0[j, k] <- mat[j, k] } + } + } > > matno0 [,1] [,2] [1,]31 [2,] NA 10 [3,] NA9 [4,]72 [5,] NA NA > HTH Steve McKinney > > rcoder > -- > View this message in context: http://www.nabble.com/missing-TRUE-FALSE- > error-in-conditional-construct-tp18972244p18972244.html > Sent from the R help mailing list archive at Nabble.com. > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting- > guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Conditional statement used in sapply()
> -Original Message- > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] > On Behalf Of Altaweel, Mark R. > Sent: Wednesday, August 13, 2008 3:03 PM > To: r-help@r-project.org > Subject: [R] Conditional statement used in sapply() > > Hi, > > I have data stored in a list that I would like to aggregate and perform > some basic stats. However, I would like to apply conditional statements so > that not all the data are used. Basically, I want to get a specific > variable, do some basic functions (such as a mean), but only get the data > in each element's data that match the condition. The code I used is below: > > > result<-sapply(res, function(.df) { #res is the list containing file > data > + if(.df$Volume>0)mean(.df$Volume) #only have the mean function calculate > on values great than 0 > + }) > You probably want something such as result<-sapply(res, function(.df) { mean(.df$Volume[.df$Volume>0]) }) HTH Steve McKinney > > I did get a numeric output; however, when I checked the output value the > conditional was ignored (i.e. it did not do anything to the calculation) > > I also obtained these warning statements: > > Warning messages: > 1: In if (.df$Volume > 0) mean(.df$Volume) : > the condition has length > 1 and only the first element will be used > 2: In if (.df$Volume > 0) mean(.df$Volume) : > the condition has length > 1 and only the first element will be used > > Please let me know what am I doing wrong and how can I apply a conditional > statement to the sapply function. > > Thanks > > Mark > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting- > guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ignoring zeros or converting to NA
The help page for is.na() is worth reading repeatedly. > -Original Message- > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] > On Behalf Of stephen sefick > Sent: Tuesday, August 12, 2008 6:53 PM > To: Charles C. Berry > Cc: Mike Prager; [EMAIL PROTECTED] > Subject: Re: [R] ignoring zeros or converting to NA > > I have been reading this thread and I am having a hard interpreting > what these mean. I know that the result is that all of the values > that are zero in a are replaced by NA. Let me try and write it out > > is.na(a[a==0] ) <- TRUE > you pull out of a all of the times that are equal to zero then is.na > tests and returns false then all of the false values are set to true? Not quite - the function 'is.na<-' needs an index vector as input. So my understanding is that all the values in 'a' that are zero will have their value set to NA, because R will replicate the TRUE (vector of length 1) to a vector of length (sum(a==0)), this being the index vector. > > is.na(a) <- a==0 > make values in a NA when a is 0? Right. a==0 generates a logical vector with length (length(a)) with value 'TRUE' wherever 'a' has a zero entry. 'is.na<-' will then set those entries to NA. The entries where 'a==0' is FALSE will not have their values changed to NA. The version ' is.na(a[a==0] ) <- TRUE ' is doing the same thing, the logical index vector for ' a[a==0] ' consists of all 'TRUE'. HTH Steve McKinney > > is this right? what the logic if not? > thanks > > Stephen Sefick > > > On Tue, Aug 12, 2008 at 6:32 PM, Charles C. Berry <[EMAIL PROTECTED]> > wrote: > > On Tue, 12 Aug 2008, Mike Prager wrote: > > > >> rcoder <[EMAIL PROTECTED]> wrote: > >> > >>> I have a matrix that has a combination of zeros and NAs. When I > perform > >>> certain calculations on the matrix, the zeros generate "Inf" values. > Is > >>> there a way to either convert the zeros in the matrix to NAs, or only > >>> perform the calculations if not zero (i.e. like using something > similar > >>> to > >>> an !all(is.na() construct)? > >> > >> Is this what you are looking for? > >> > >>> # make some data > >>> a = matrix(c(rep(0,6), rep(2,6)), nrow = 4) > >>> a > >> > >>[,1] [,2] [,3] > >> [1,]002 > >> [2,]002 > >> [3,]022 > >> [4,]022 > >>> > >>> # change zero to NA > >>> is.na(a[a==0] ) <- TRUE > > > > Or > >is.na(a) <- a==0 > > > > Chuck > > > >>> a > >> > >>[,1] [,2] [,3] > >> [1,] NA NA2 > >> [2,] NA NA2 > >> [3,] NA22 > >> [4,] NA22 > >> > >> -- > >> Mike Prager, NOAA, Beaufort, NC > >> * Opinions expressed are personal and not represented otherwise. > >> * Any use of tradenames does not constitute a NOAA endorsement. > >> > >> __ > >> R-help@r-project.org mailing list > >> https://stat.ethz.ch/mailman/listinfo/r-help > >> PLEASE do read the posting guide > >> http://www.R-project.org/posting-guide.html > >> and provide commented, minimal, self-contained, reproducible code. > >> > > > > Charles C. Berry(858) 534-2098 > >Dept of Family/Preventive > > Medicine > > E mailto:[EMAIL PROTECTED] UC San Diego > > http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093- > 0901 > > > > __ > > R-help@r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide http://www.R-project.org/posting- > guide.html > > and provide commented, minimal, self-contained, reproducible code. > > > > > > -- > Let's not spend our time and resources thinking about things that are > so little or so large that all they really do for us is puff us up and > make us feel like gods. We are mammals, and have not exhausted the > annoying little problems of being mammals. > > -K. Mullis > > __ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting- > guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] source a script file straight from a subversion repository
Thanks to Duncan Murdoch and Marc Schwartz for their excellent help. As we don't yet have the apache http: interface to svn running yet, 'svn export' is the access mechanism. As I don't want to leave stale files lying around, or clobber a local file should there be one with the same name, R's tempfile() and Marc's ideas work very well. ## Get a local copy of the file from the ## version control repository svn on myrepo.xxx.org tempfilename <- tempfile() shellcmd <- paste("svn export ", "svn://myrepo.xxx.org/opt/svn/repos/", "Projects/SMcode/tags/v1.0/SMfuns.R ", tempfilename, sep = "") system(shellcmd) if (file.exists(tempfilename) ) { source(tempfilename) unlink(tempfilename) } else { stop("Can not get copy of SMfuns.R") } I can wrap this in a function where needed as Marc suggests. Thanks again for the help Best Steve McKinney > -Original Message- > From: Marc Schwartz [mailto:[EMAIL PROTECTED] > Sent: Fri 8/1/2008 5:42 PM > To: Steven McKinney > Cc: r-help@r-project.org > Subject: Re: [R] source a script file straight from a subversion repository > > on 08/01/2008 06:49 PM Steven McKinney wrote: > > Hi useRs > > > > I'm trying to figure out how to source > > an R script file straight from a subversion > > repository, without having to put a copy > > of the script into the local working directory. > > > > Has anyone done this? > > > > Something such as > > > > source(file = paste("svn://myrepo.xxx.org/opt/svn/repos/", > > "/Projects/SMcode/tags/v1.0/SMfuns.R", > > sep = "")) > > > > though this does not work. > > > > Perhaps I need a connection of some sort > > instead of just a text string for the > > file argument? > > > > I'm not sure if this can work, and have not > > found an example searching through the archives > > and help manuals as yet. > > > > If anyone has done this and can offer any > > tips I'd appreciate it. > > The problem is that the files in a svn repo are not stored as directly > addressable files. They are stored in a database, typically using the > Berkeley DB, but will use FSFS if NFS is to be used. You can't get to > them without going through the svn interface (either CLI or one of the > GUIs or something like Trac). > > Thus, you either have to do a checkout ('svn co'), which means an entire > directory tree or sub-tree, or do an export ('svn export') to access a > single file. In the former case, you get a 'working copy' of the tree > which can be managed by svn. In the latter case, you get a single file, > but one that is not under svn control. > > To do what you want, some variation of: > > 1. 'svn export' the file > 2. source() the file > 3. delete the file > > would be required. You can wrap all of that in a function to shield the > individual steps, but at minimum, the 'svn export' would be required to > get to the single file locally. You can call the 'svn export' using the > R system() function, thus doing it all within an R session. > > So something like: > > SVNSource <- function(FullPathToSVNFileName) > { >cmd <- paste("svn export", FullPathToSVNFileName) >system(cmd) >source(basename(FullPathToSVNFileName)) >file.remove(basename(FullPathToSVNFileName)) > } > > The basename() R function returns just the file name, stripping the path > component. > > Thus, given your example: > > SVNSource("svn://myrepo.xxx.org/opt/svn/repos/Projects/SMcode/tags/v1.0/SMfuns.R") > > should hopefully work, but I would test it all to be sure. > > This presumes that the command: > > $ svn export > svn://myrepo.xxx.org/opt/svn/repos/Projects/SMcode/tags/v1.0/SMfuns.R > > works from the CLI to begin with to get the file locally. > > HTH, > > Marc Schwartz __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] source a script file straight from a subversion repository
Hi useRs I'm trying to figure out how to source an R script file straight from a subversion repository, without having to put a copy of the script into the local working directory. Has anyone done this? Something such as source(file = paste("svn://myrepo.xxx.org/opt/svn/repos/", "/Projects/SMcode/tags/v1.0/SMfuns.R", sep = "")) though this does not work. Perhaps I need a connection of some sort instead of just a text string for the file argument? I'm not sure if this can work, and have not found an example searching through the archives and help manuals as yet. If anyone has done this and can offer any tips I'd appreciate it. Best Steve McKinney __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] sort rows of matrix by rows of another matrix
I came up with the same solution as Mark Leeds. > a1roworders <- t(apply(a1, 1, order)) > a2ord <- t(sapply(seq(nrow(a1)), function(x) a2[x, a1roworders[x,]] )) > a2ord [,1] [,2] [,3] [1,] 102 101 103 [2,] 102 101 103 [3,] 103 101 102 [4,] 101 102 103 Mark's question about the orientation of apply() output is common, and a source of confusion for many who are used to matrix results that just show up in the right orientation from most other S-language function output. The help page for apply() states "If each call to FUN returns a vector of length n, then apply returns an array of dimension c(n, dim(X)[MARGIN]) if n > 1." So, whenever you use apply() on margin 1 (the rows) and your function returns a result for each column, each call to FUN returns a vector of length n = ncol(X), and this dimension becomes the row dimension of the result (i.e. column dimensions and row dimensions appear mysteriously switched). Since apply() can be used on multiple margins of arrays with many margins, some standard output format had to be set up, and the format is stated above. So this is just one of those bits of S language trivia that you have to file away in the grey matter - since apply() can be used on arbitrary arrays, it has to have a standard output format, and that standard appears to transpose your result matrix when you do column-wise stuff to the rows of your two-dimensional matrix. The old Blue Book ("The New S Language") showed examples apply(x, 2, sort) # sort columns of x t(apply(x, 1, sort)) # transpose result of row sort and commented "The sorting examples show the difference between row and column operations when the results returned by FUN are vectors. The returned value becomes the dimension of the result, hence the transpose is necessary with row sorts." This was a useful comment and pointed this issue out plainly. I'd vote for this set of examples to be in R's help page for apply(), with the comment as well, e.g. apply(x, 2, sort) # sort columns of x t(apply(x, 1, sort)) # transpose result of row sort ## The sorting examples show the difference between row and ## column operations when the results returned by FUN are vectors. ## The returned value becomes the dimension of the result, ## hence the transpose is necessary with row sorts." would be useful examples and comments for R's apply() help page. HTH Steve McKinney -Original Message- From: [EMAIL PROTECTED] on behalf of [EMAIL PROTECTED] Sent: Thu 7/31/2008 3:34 PM To: Johnson, Eric A. (Seattle) Cc: r-help@r-project.org Subject: Re: [R] sort rows of matrix by rows of another matrix below is another way ( maybe the same ? )but with an extra line to make roworder. i'm also not clear on why I have to take the transpose of the result that comes back form the apply call. in ?apply, it says that the function tries to convert the result back to a matrix. that's fine but why does it do it in the opposite way from the way the data in sent in ( i.e : by row ). if someone could explain that, i'd appreciate it. a1 <- structure(c(7, 4, 4, 0, 6, 2, 7, 3, 8, 4, 2, 8), .Dim = c(4L, 3L)) a2 <- structure(c(101L, 101L, 101L, 101L, 102L, 102L, 102L, 102L, 103L, 103L, 103L, 103L), .Dim = c(4L, 3L)) roworder <- t(apply(a1,1,order)) a3 <- t(sapply(1:nrow(a2),function(.rowind) { a2[.rowind,roworder[.rowind,]] })) print(a3) On Thu, Jul 31, 2008 at 6:05 PM, Johnson, Eric A. (Seattle) wrote: > If you're not adverse to cbind-ing a1 and a2, you can use this: > > a1a2 <- cbind(a1, a2) > > a3 <- t(apply(a1a2, 1, function(x) x[order(x[1:ncol(a1)])+ncol(a1)])) > > Eric > > > -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] > On Behalf Of Timothy W. Hilton > Sent: Thursday, July 31, 2008 2:19 PM > To: r-help@r-project.org > Subject: [R] sort rows of matrix by rows of another matrix > > Hello all, > > I am trying to sort rows of one matrix by rows of another. Given a1 > and > a2: > > -- >> a1 > [,1] [,2] [,3] > [1,]768 > [2,]424 > [3,]472 > [4,]038 > > a1 <- > structure(c(7, 4, 4, 0, 6, 2, 7, 3, 8, 4, 2, 8), .Dim = c(4L, 3L)) > >> a2 > [,1] [,2] [,3] > [1,] 101 102 103 > [2,] 101 102 103 > [3,] 101 102 103 > [4,] 101 102 103 > > a2 <- > structure(c(101L, 101L, 101L, 101L, 102L, 102L, 102L, 102L, 103L, > 103L, 103L, 103L), .Dim = c(4L, 3L)) > -- > > I want to get a3: > >> a3 > [,1] [,2] [,3] > [1,] 102 101 103 > [2,] 102 101 103 > [3,] 103 101 102 > [4,] 101 102 103 > > where the rows of a3 are the rows of a2 sorted according to the rows > of > a1. > > I can get the necessary sorting index: >> apply(a1, 1, order) > [,1] [,2] [,3] [,4] > [1,]2231 > [2,]1112 > [3,]3323 > > and I can get the rows of a1 sorted according to the rows of a1: >> t(apply(a1, 1, function(x) x[order(x)])) > [,1] [,2] [,3] > [1,]6
Re: [R] 64-bit R on Mac OS X 10.4.5
Hello I haven't found better instructions, it's just not an easy thing to do. You might also consider joining the r-sig-mac group and reviewing threads there for additional information. Rather than try to configure, make and install with one giant command, I'd suggest breaking the task down until you have worked out all the details. Then you can build more routinely with giant commands such as the one you report below. Note that the page of the URL you posted says this: "R on Mac OS X 10.5 (Leopard)" so look around the r.research.att.com website for older 10.4 related instructions. OS X 10.5 is different enough from 10.4 that copying and pasting the 10.5 instructions will not work everywhere. You don't say what kind of computer you have (Intel or PowerPC), if you have a power pc then -arch x86_64 is not right. So do report your hardware configuration as well as your operating system configuration. Try breaking up the build process into steps and review the messages generated at each step. e.g. the configure step: Just do these bits cd rd64 ../R-devel/configure SHELL='/bin/bash' \ r_arch=x86_64 \ CC="gcc -arch x86_64 -std=gnu99" \ CXX="g++ -arch x86_64" \ OBJC="gcc -arch x86_64" \ F77="gfortran -arch x86_64" \ FC="gfortran -arch x86_64" \ --with-system-zlib \ --with-blas='-framework vecLib' \ --with-lapack 1> configure.R.txt 2>&1 The end bits of this version of the configure command redirect output and error messages to a file that you can then read, to see which bits you are missing and which bits cause problems. It's tough to catch those as they scroll by in a terminal window. If configure runs without any signs of trouble, try the make. make 1> make.R.txt 2>&1 Then you can review all the make output for signs of problems and error messages. Here's a configure command that worked for me to build a 64-bit R on 10.4 a while back: ./configure --host=powerpc64-apple-darwin8.10.0 --build=powerpc64-apple-darwin8.10.0 \ --prefix=/usr/local/lib64 'CC=gcc-4.0 -arch ppc64' 'CXX=g++ -arch ppc64' \ 'FC=gfortran-4.0 -arch ppc64' 'F77=gfortran-4.0 -arch ppc64' \ 'CFLAGS=-g -O3 -mtune=G5 -mcpu=G5' 'FFLAGS=-g -O3 -mtune=G5 -mcpu=G5' \ 'LDFLAGS=-arch ppc64 -m64 -L/usr/local/lib' 'CXXFLAGS=-g -O3 -mtune=G5 -mcpu=G5' \ 'FCFLAGS=-g -O3 -mtune=G5 -mcpu=G5' --disable-R-framework --enable-R-shlib \ '--with-blas=-framework vecLib' --with-lapack --without-iconv 1> configure.R.txt 2>&1 and also note that you need to have root privileges when doing the make install, so you either need to run the command as root (not so common on Mac OS X) or run sudo make install and enter your password etc. HTH Steve McKinney -Original Message- From: [EMAIL PROTECTED] on behalf of joseph Sent: Fri 7/25/2008 5:07 PM To: r-help@r-project.org Cc: r-help@r-project.org Subject: [R] 64-bit R on Mac OS X 10.4.5 Hello I have a Mac OS X 10.4.5. I am trying to build a 64-bit R by following the directions on this page: http://r.research.att.com/building.html r_arch=x86_64 \ CC="gcc -arch x86_64 -std=gnu99" \ CXX="g++ -arch x86_64" \ OBJC="gcc -arch x86_64" \ F77="gfortran -arch x86_64" \ FC="gfortran -arch x86_64" PATH=/usr/X11/bin:/usr/local/bin:$PATH export PATH LANG=C export LANG svn co https://svn.r-project.org/R/trunk R-devel # you may have to accept a certificate here cd R-devel tools/rsync-recommended cd .. # got the sources, on to building 64-bit R mkdir rd64 cd rd64 ../R-devel/configure SHELL='/bin/bash' \ r_arch=x86_64 \ CC="gcc -arch x86_64 -std=gnu99" \ CXX="g++ -arch x86_64" \ OBJC="gcc -arch x86_64" \ F77="gfortran -arch x86_64" \ FC="gfortran -arch x86_64" \ --with-system-zlib \ --with-blas='-framework vecLib' --with-lapack && \ make -j4 && \ make check && \ make install cd ..when I try to run it by typing R, it gives me the following error: -bash: R: command not found Can any body help me to solve this problem or direct me to better step-by-step instructions. Thanks Joseph [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Calling LISP programs in R
Have you reviewed the old xlisp machinery that used to be in R? Check out the RXLisp library at http://www.omegahat.org/RXLisp/ and see if this will work. The PDF http://www.omegahat.org/RXLisp/examples.pdf reviews calling xlisp from R. HTH Steve McKinney -Original Message- From: [EMAIL PROTECTED] on behalf of francogrex Sent: Wed 7/23/2008 7:06 AM To: r-help@r-project.org Subject: [R] Calling LISP programs in R I have written some programs in Common Lisp and I have been using SAS to pipe those programs to my lisp compiler in batch mode by using the %xlog and %xlst SAS commands. I wonder if there is in R a similar way to pipe commands to LISP so that all my work would be concentrated in R even when I have to call a LISP program? I have looked at the foreign library but this seems to adjust data types not for piping commands in batch mode. Here is a simple program (example) to generate random normal variables from SAS to LISP; of course it's a toy example because there is no need for a LISP routine in this particular case. I hope R has a similar feature. Thanks DATA _NULL_; FILE 'c:\cl.lisp' LRECL=1024; PUT "(defun run (num Mn SD)"; PUT "(setq mix (list nil))"; PUT "(dotimes (n num)"; PUT"(setq u (- (* 2 (random 1.0)) 1)"; PUT"v (- (* 2 (random 1.0)) 1)"; PUT"w (+ (expt u 2) (expt v 2)))"; PUT"(when (< w 1)"; PUT "(progn"; PUT"(setq z (sqrt (/ (* -2 (log w)) w))"; PUT"x (* u z)"; PUT"y (* v z)"; PUT "mix (append (list x) mix)"; PUT "mix (append (list y) mix)"; PUT "(setq mix (remove nil mix)"; PUT "mixnew (loop for x in mix append (list(+ Mn (* SD x)"; PUT "(print mixnew)"; PUT "(flet(( mean(zlist)"; PUT "(setq sum (loop for x in zlist sum x))"; PUT "(setq m (/ sum (length zlist)"; PUT '(pprint (list "The mean is:" (mean mixnew'; PUT "(flet((var(zlist)"; PUT "(setq sumsq (loop for x in zlist sum (* (- x m) (- x m)))"; PUT "v (/ sumsq (- (length zlist) 1)"; PUT "(setq std (sqrt (var mixnew"; PUT '(pprint (list "The Standard Deviation is:" std)))'; PUT "(run 500 41 17.5)"; RUN; %xlog(CL); -- View this message in context: http://www.nabble.com/Calling-LISP-programs-in-R-tp18611512p18611512.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.