Re: [R] attach
On Wed, 2009-10-14 at 07:21 +0200, Christophe Dutang wrote:
> Hi all,
>
> I have a question regarding the memory usage of the attach function. Say I have a data.frame inputdata that I create with read.csv. I would like to know what happens on the memory side when I use
>
> attach(inputdata)
>
> Is there a second allocation of memory for inputdata?
>
> Then I'm using eval on an expression which depends on the columns of inputdata. Is it better not to use the attach function?
>
> Thanks in advance
>
> Christophe

Well, if you attach a data.frame twice, it uses your memory twice.

I don't use attach; I prefer with(data.frame, command).

--
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
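The with() alternative mentioned in the reply can be sketched as follows. This is only an illustration: the data frame contents and the expression are made up, and the snippet does not measure memory. It shows that with() and eval() can both evaluate an expression against the columns without touching the search path, whereas attach() adds an entry to it.

```r
# Hypothetical stand-in for the data read with read.csv()
inputdata <- data.frame(a = 1:5, b = c(2, 4, 6, 8, 10))

# An expression that depends on the columns of inputdata
expr <- quote(a + b)

# Evaluate against the columns without attaching anything
res_with <- with(inputdata, a + b)
res_eval <- eval(expr, envir = inputdata)

# attach() puts a new entry on the search path until detach() is called
n_before <- length(search())
attach(inputdata)
n_attached <- length(search())
detach(inputdata)
n_after <- length(search())
```

Passing the data frame as the envir argument of eval() is probably the closest fit to the original question, since the expression there is built separately.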
Re: [R] How do I access with the name of a (passed) function
On 17/10/2009 7:26 AM, Ajay Shah wrote:
> How would I do something like this:
>
> f <- function(x, g) {
>   s <- as.character(g)   # THIS DOES NOT WORK
>   sprintf("The %s of x is %.0f\n", s, g(x))
> }

Gabor showed you how to do it if you pass an expression which evaluates to a function. If you want to pass an expression that returns a character string as below, use

if (is.character(g)) {
  s <- g
  g <- get(s, parent.frame())  # gets it from the caller's frame
}

> f(c(2,3,4), "median")
> f(c(2,3,4), "mean")
>
> and get the results
>
> The median of x is 3
> The mean of x is 3

Duncan Murdoch
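Both cases discussed in this thread can be combined in one function. The following is a minimal sketch, not the code posted by Gabor or Duncan: it assumes that deparse(substitute(g)) is an acceptable way to recover the name when a function object is passed, and it uses get() as in the reply when a character string is passed.

```r
f <- function(x, g) {
  if (is.character(g)) {
    # g is a name such as "median": keep the string, look the function up
    s <- g
    g <- get(s, mode = "function", envir = parent.frame())
  } else {
    # g is a function: recover the expression the caller typed
    s <- deparse(substitute(g))
  }
  sprintf("The %s of x is %.0f", s, g(x))
}

f(c(2, 3, 4), "median")
f(c(2, 3, 4), mean)
```

The deparse(substitute(g)) branch only gives a readable name when the caller passes a simple symbol; for an anonymous function it returns the whole function text.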
Re: [R] function to convert lm model to LaTeX equation
Ista Zahn wrote:
> Dear list,
>
> I've tried several times to wrap my head around the Design library, without much success. It does some really nice things, but I'm often uncomfortable because I don't understand exactly what it's doing. Anyway, one thing I really like is the latex.ols() function, which converts an R linear model formula to a LaTeX equation.
>
> So, I started writing a latex.lm() function (not actually using classes at this point, I just named it that for consistency). This turned out to be easy enough for simple cases (see code below), but now I'm wondering a) if anyone knows of existing functions that do this (again, for lm() models, I know I'm reinventing the wheel in as far as the Design library goes), or if not, b) if anyone has suggestions for improving the function below.
>
> Thanks,
> Ista
>
> ### Function to create LaTeX formula from lm() model. Needs amsmath package in LaTeX. ###
> latex.lm <- function(object, file="", math.env=c("$","$"), estimates="none", abbreviate = TRUE, abbrev.length=8, digits=3) {
>   # Get and format IV names
>   co <- c("Int", names(object$coefficients)[-1])
>   co.n <- gsub("p.*)", "", co)
>   if(abbreviate == TRUE) {
>     co.n <- abbreviate(gsub("p.*)", "", co), minlength=abbrev.length)
>   }
>   # Get and format DV
>   m.y <- strsplit((as.character(object$call[2])), " ~ ")[[1]][1]
>   # Write coefficient labels
>   b.x <- paste("\\beta_{", co.n ,"}", sep="")
>   # Write error term
>   e <- "\\epsilon_i"
>   # Format coefficient x variable terms
>   m.x <- sub("}Int","}", paste(b.x, co.n, " + ", sep="", collapse=""))
>   # If inline estimates convert coefficient labels to values
>   if(estimates == "inline") {
>     m.x <- sub("Int", "", paste(round(object$coefficients,digits=digits), co.n, " + ", sep="", collapse=""))
>     m.x <- gsub("\\+ \\-", "-", m.x)
>   }
>   # Format regression equation
>   eqn <- gsub(":", "\\times ", paste(math.env[1], m.y, " = ", m.x, e, sep=""))
>   # Write the opening math mode tag and the model
>   cat(eqn, file=file)
>   # If separate estimates format estimates and write them below the model
>   if(estimates == "separate") {
>     est <- gsub(":", "\\times ", paste(b.x, " = ", round(object$coefficients, digits=digits), ", ", sep="", collapse=""))
>     cat(", \n \\text{where }", substr(est, 1, (nchar(est)-2)), file=file)
>   }
>   # Write the closing math mode tag
>   cat(math.env[2], "\n", file=file)
> }
> # END latex.lm
>
> Xvar1 <- rnorm(20)
> Xvar2 <- rnorm(20)
> Xvar3 <- factor(rep(c("A","B"),10))
> Y.var <- rnorm(20)
> D <- data.frame(Xvar1, Xvar2, Xvar3, Y.var)
> x1 <- lm(Y.var ~ pol(Xvar1, 3) + Xvar2*Xvar3, data=D)
> latex.lm(x1)

It's not reinventing the wheel, in the sense that you are not attempting to handle the most needed features (simplifying regression splines and factoring out interaction terms with brackets).

I don't think you followed the posting guide though. You didn't state your exact problem with Design and you didn't include any code.

Also note that the Design package is replaced with the rms package although latex features have not changed.

--
Frank E Harrell Jr
Professor and Chair, School of Medicine
Department of Biostatistics, Vanderbilt University
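For readers who only need the simplest case, the idea in the post can be reduced to a few lines of base R. This is a sketch under stated assumptions, not a replacement for latex.lm() or for the latex features in Design/rms: the function name lm_to_latex is made up, the model is assumed to have an intercept, and splines, factor labels and bracketed interactions are not handled.

```r
# Hypothetical helper: turn the fitted coefficients of an lm() into an
# inline LaTeX equation. Assumes the first coefficient is the intercept.
lm_to_latex <- function(fit, digits = 3) {
  b <- coef(fit)
  term_names <- names(b)
  y <- all.vars(formula(fit))[1]          # name of the response variable
  rhs <- formatC(b[1], format = "f", digits = digits)
  for (i in seq_along(b)[-1]) {
    sign <- if (b[i] < 0) " - " else " + "
    # Interaction terms such as a:b are written with \times
    label <- gsub(":", " \\\\times ", term_names[i])
    rhs <- paste0(rhs, sign,
                  formatC(abs(b[i]), format = "f", digits = digits),
                  " \\, \\mathrm{", label, "}")
  }
  paste0("$", y, " = ", rhs, " + \\epsilon_i$")
}

fit <- lm(mpg ~ wt * hp, data = mtcars)
cat(lm_to_latex(fit), "\n")
```

Variable names containing underscores or other LaTeX special characters would need escaping before the result is usable in a document.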
Re: [R] function to convert lm model to LaTeX equation
On Sun, Oct 18, 2009 at 9:09 AM, Frank E Harrell Jr f.harr...@vanderbilt.edu wrote:
> Ista Zahn wrote:
>> [original post and latex.lm() code, quoted in full from the previous message]
>
> It's not reinventing the wheel, in the sense that you are not attempting to handle the most needed features (simplifying regression splines and factoring out interaction terms with brackets).
>
> I don't think you followed the posting guide though. You didn't state your exact problem with Design and you didn't include any code.
>
> Also note that the Design package is replaced with the rms package although latex features have not changed.

Thank you for your response Prof. Harrell. Sorry my original post didn't meet the guidelines -- it was poorly worded I'm afraid. The question was not about the Design package, but about how to represent a lm() model as a LaTeX equation, and specifically whether anyone had already written code for this task, and if not how the function I wrote could be improved.

Thank you for your suggestions about needing to handle regression splines and factoring out interaction terms, that's very helpful.

Thanks again,
Ista

--
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org
Re: [R] QQ plot
Madan Sigdel wrote:
> Dear users
>
> I have applied following for my works:
>
> library(car)
> x <- scan()
> 1: 0.92545  2: 0.89321  3: 0.9846  4: 2.9  5: 0.85968
> 6: 5.2  7: 4.66  8: 1.18788  9: 1.07683  10: 1.07683
> 11: 8.38  12: 7.423  13: 0.972  14: 3.73  15: 1.06474
> 16: 1.48  17: 0.92876  18: 2.26493  19: 0.85696  20: 1.89313
> 21: 2.71  22: 5  23: 3.02  24: 0.90369  25: 8.81
> 26: 1.69466  27: 1.07055  28: 1.17077  29: 2.31647  30: 0.83481
> 31: 5.42  32: 9.68  33: 1.27294  34: 5.49  35: 2.48
> 36: 1.55876  37: 1.41419  38: 0.94503  39: 5.24
> 40:
> Read 39 items
> x <- rgamma(39, shape=1.7, scale=1.6)
> qq.plot(x, dist="gamma", shape=1.7, scale=1.6)
>
> In this way can I get correct figure? My doubt is in the resulting figure, gamma quantiles are on the X-axis and empirical quantiles are on the Y-axis, however the values on the Y-axis are up to 15. But the maximum point in my data set is 9.68. How can it be correct?

As far as I can see, it is not your data that you are plotting, but rgamma(..).

--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalga...@biostat.ku.dk)              FAX: (+45) 35327907
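The point of the reply is that the second assignment to x overwrites the scanned data with simulated values. A sketch of the plot that was presumably intended, using only base R instead of car's qq.plot (an assumption made to keep the snippet self-contained), keeps the 39 observations and compares them with the quantiles of the stated gamma distribution:

```r
# The 39 observations from the post, entered directly instead of via scan()
x <- c(0.92545, 0.89321, 0.9846, 2.9, 0.85968, 5.2, 4.66, 1.18788, 1.07683, 1.07683,
       8.38, 7.423, 0.972, 3.73, 1.06474, 1.48, 0.92876, 2.26493, 0.85696, 1.89313,
       2.71, 5, 3.02, 0.90369, 8.81, 1.69466, 1.07055, 1.17077, 2.31647, 0.83481,
       5.42, 9.68, 1.27294, 5.49, 2.48, 1.55876, 1.41419, 0.94503, 5.24)

# Theoretical quantiles of Gamma(shape = 1.7, scale = 1.6) at the plotting positions
theo <- qgamma(ppoints(length(x)), shape = 1.7, scale = 1.6)

# Empirical quantiles (sorted data) against theoretical quantiles
qq <- qqplot(theo, x,
             xlab = "Gamma(1.7, 1.6) quantiles",
             ylab = "Empirical quantiles")
abline(0, 1, lty = 2)  # reference line y = x
```

With the data left intact, the vertical axis cannot exceed the largest observation.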
[R] optimization problem with constraints..
Apologies if this shows up a second time with uninformative header (apparently it got filtered, but ...), as I forgot to replace the subject line.

As a first try, use a bounds constrained method (L-BFGS-B or one from the r-forge Optimizer project http://r-forge.r-project.org/R/?group_id=395) and then add a penalty or barrier function to your objective function to take care of the x1+x2 < 1 constraint (the other end is implicit in the lower bounds on x1 and x2), e.g.,

  - const * log(1-x1-x2)

You should provide a feasible starting point. const scales the penalty.

Cheers, JN

> Message: 27
> Date: Sat, 17 Oct 2009 13:50:10 -0700 (PDT)
> From: kathie kathryn.lord2...@gmail.com
> Subject: [R] optimization problem with constraints...
> To: r-help@r-project.org
> Message-ID: 25941686.p...@talk.nabble.com
> Content-Type: text/plain; charset=us-ascii
>
> Dear R users,
>
> I need some advice on how to use R to optimize a nonlinear function with the following constraints.
>
> f(x1,x2,x3,x4,x5,x6)
>
> s.t
>    0 < x1 < 1
>    0 < x2 < 1
>    0 < x1+x2 < 1
> -inf < x3 < inf
> -inf < x4 < inf
>    0 < x5 < inf
>    0 < x6 < inf
>
> Is there any built-in function or something for these constraints?
>
> Any suggestion will be greatly appreciated.
>
> Regards,
> Kathryn Lord
> --
> View this message in context: http://www.nabble.com/optimization-problem-with-constraints...-tp25941686p25941686.html
> Sent from the R help mailing list archive at Nabble.com.
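The suggestion above can be sketched with optim() and method "L-BFGS-B". Everything specific here is an assumption for illustration: the objective f is a made-up quadratic, const = 1e-3 and eps = 1e-8 are arbitrary choices, and a large finite value is returned outside the feasible region because L-BFGS-B needs finite function values.

```r
# Hypothetical objective; replace with the real f(x1, ..., x6)
f <- function(p) sum((p - c(0.3, 0.4, 1, -2, 0.5, 2))^2)

# Objective plus log barrier for x1 + x2 < 1; const scales the penalty
penalized <- function(p, const = 1e-3) {
  slack <- 1 - p[1] - p[2]
  if (slack <= 0) return(1e10)   # infeasible: large but finite value
  f(p) - const * log(slack)
}

eps <- 1e-8                       # keeps the box bounds strictly inside (0, 1)
fit <- optim(par = c(0.2, 0.2, 0, 0, 1, 1),      # feasible starting point
             fn = penalized,
             method = "L-BFGS-B",
             lower = c(eps, eps, -Inf, -Inf, eps, eps),
             upper = c(1 - eps, 1 - eps, Inf, Inf, Inf, Inf))
fit$par
```

If the true optimum lies on the boundary x1 + x2 = 1, one would typically rerun with smaller values of const, starting each run from the previous solution.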
Re: [R] Putting names on a ggplot
Thanks Stefan, the annotate approach works beautifully. I had not got that far in Hadley's book apparently :(

I'm not convinced though that the explanation

> you shouldn't use aes in this case since nampost, temprange, ... are not part of the dataframe year.

makes sense since it seems to work in this case unless I am missing something obvious. Mind you I'm good at missing the obvious especially in ggplot.

Example
=====
library(ggplot2)
month <- rep(month.abb[1:12], each=30)
days <- rep(1:30, 12)
temps <- rnorm(length(month), mean=25, sd=8)
duration <- 1:length(month)
timedata <- data.frame(month, days, temps, duration)
head(timedata)
mbs <- c(1, seq(30, 360, by=30))
namposts <- mbs[1:12]
mlabs <- month.name[1:12]
trange <- range(timedata$temps)
drange <- range(duration)

p <- ggplot(timedata, aes(duration, temps, colour=month)) + geom_line() +
  opts(legend.position = "none", title="Yearly temperatures",
       axis.text.x = theme_blank(), axis.ticks = theme_blank())

p <- p + geom_vline(xintercept= mbs) +
  ylab("Temperature (C)") + xlab("Daily Temperatures") +
  geom_text(aes(x = namposts+2.5, y = trange[2], label = mlabs),
            data = timedata, size = 2.5, colour='black', hjust = 0, vjust = 0)
p
=====

--- On Sat, 10/17/09, m...@z107.de m...@z107.de wrote:
> From: m...@z107.de m...@z107.de
> Subject: Re: [R] Putting names on a ggplot
> To: John Kane jrkrid...@yahoo.ca
> Cc: R R-help r-h...@stat.math.ethz.ch
> Received: Saturday, October 17, 2009, 5:53 PM
>
> hi,
>
> On Sat, Oct 17, 2009 at 02:04:43PM -0700, John Kane wrote:
>> Putting names on a ggplot
>> p <- p + geom_text(aes(x = namposts + 2.5, y = temprange[2], label = mlabs), data = year, size = 2.5, colour='black', hjust = 0, vjust = 0)
>
> you shouldn't use aes in this case since nampost, temprange, ... are not part of the dataframe year.
>
> It should also work with geom_text i guess, but I prefer annotate for things like that:
>
> p + annotate("text", x=namposts+2.5, y = temprange[2], label= mlabs, size=2.5, colour='black', hjust = 0, vjust = 0)
>
> regards,
> stefan
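The example in this thread uses the 2009 ggplot2 interface (opts(), theme_blank()), which later releases replaced. The following is a sketch of the same plot with annotate(), assuming a current ggplot2 installation where theme(), element_blank() and labs() are available; the random temperatures are simulated as in the post.

```r
library(ggplot2)

set.seed(42)
timedata <- data.frame(
  month    = factor(rep(month.abb, each = 30), levels = month.abb),
  temps    = rnorm(360, mean = 25, sd = 8),
  duration = 1:360
)
mbs      <- c(1, seq(30, 360, by = 30))   # month boundaries
namposts <- mbs[1:12]                     # x positions of the month labels
mlabs    <- month.name
trange   <- range(timedata$temps)

p <- ggplot(timedata, aes(duration, temps, colour = month)) +
  geom_line() +
  geom_vline(xintercept = mbs) +
  # annotate() takes plain vectors, so nothing has to live in the plot data
  annotate("text", x = namposts + 2.5, y = trange[2], label = mlabs,
           size = 2.5, colour = "black", hjust = 0, vjust = 0) +
  labs(title = "Yearly temperatures",
       x = "Daily Temperatures", y = "Temperature (C)") +
  theme(legend.position = "none",
        axis.text.x = element_blank(),
        axis.ticks = element_blank())
p
```

Because the label positions are passed as vectors of length 12 rather than mapped with aes(), they are not recycled against the 360 rows of timedata.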
Re: [R] looking for reference that covers convergence in distribution
Peng Yu wrote:
> On Sat, Oct 17, 2009 at 3:28 PM, Peter Dalgaard p.dalga...@biostat.ku.dk wrote:
>> Ben Bolker wrote:
>>> Peng Yu wrote:
>>>> I am looking for a good probability book that describes convergence in distribution. I have looked through Introduction to Probability by Charles M. Grinstead, J. Laurie Snell, but I don't find any formal description of convergence in distribution. Could somebody recommend a good book that covers this topic? Thank you!
>>>
>>> This mailing list is for R help, not general statistics help. May I respectfully request that you take your questions to a statistics help list instead?
>>
>> You may want to check out the R package ConvergenceConcepts, though. Supporting article due to appear in the next issue of the R Journal.
>
> I have checked sci.stat.math before my original post. But it is seriously flooded with junk posts. Before this problem is fixed, it is probably not very helpful to post anything there.
>
> I know my question is rudimentary. There are so many probability and statistics textbooks online, and it is difficult for me to figure out which one fits my need. If you happen to know which book is the best for me to learn convergence in distribution, please let me know. Thank you!

I wonder if All of Statistics by Wasserman would be useful. It's uneven but seems to be a pretty good whirlwind review of various topics.

--
View this message in context: http://www.nabble.com/looking-for-reference-that-covers-convergence-in-distribution-tp25940017p25947801.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Proper syntax for using varConstPower in nlme
On Fri, Oct 16, 2009 at 3:56 PM, Michael A. Gilchrist mi...@utk.edu wrote:
> Hi Dieter,
>
> Thanks for the reply. I had played with the initial conditions, but apparently not enough. I finally found some that avoided the singularity issue. Many thanks.
>
> More generally, I went over the documentation yet again in PB and I still find it a bit confusing. They talk about using form = ~fitted(.) when discussing varPower, but the rest of the documentation seems to suggest that form = ~... should be used to indicate which covariate you assume the variance changes with. Could you or someone else provide some clarification?

I don't have PB in front of me, but see the 'form' argument definition on the help page for any of the variance structures depending on a covariate (varPower, varConstPower or varExp). The ~fitted(.) and ~resid(.) notations specify that you would like the variance covariate to be a function of the model being fit (the fitted values or residuals), in which case the variance parameter \delta and the model body are estimated iteratively. Conversely, if you specify a constant variance covariate such as ~age, there is no need for updating of the variance covariate during optimization.

hth,
Kingsford Jones

> Thanks.
>
> Mike
>
> On Fri, 16 Oct 2009, Dieter Menne wrote:
>> Michael A. Gilchrist wrote:
>>> nlme(Count ~ quad.PBMC.model(aL, aN, T0),
>>> +   data = tissueData,
>>> +   weights = varConstPower(form =~ Count),
>>> +   start = list( fixed = c(rep(1000, 8), -2, -2) ),
>>> +   fixed = list(T0 ~ TypeTissue-1, aL ~ 1, aN ~ 1),
>>> +   random = aL + aN ~ 1|Tissue
>>> + )
>>> Error in MEestimate(nlmeSt, grpShrunk) :
>>>   Singularity in backsolve at level 0, block 1
>>
>> You could use varPower(form=~fitted(.)) (the default, also varPower()). In my experience this runs into some limitation quickly with nlme, because some boundary conditions make convergence fail. Try varPower(fixed = 0.5) first and play with the number.
>>
>> You should only use varConstPower when you have problems with values that cover a large range, coming close to zero, which could make varPower go haywire.
>>
>> Always do a plot of the result; the default plot gives you residuals, and some indication of how to proceed.
>>
>> Dieter
>> --
>> View this message in context: http://www.nabble.com/Proper-syntax-for-using-varConstPower-in-nlme-tp25914277p25927578.html
>> Sent from the R help mailing list archive at Nabble.com.
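The difference between a fixed variance covariate and one based on the fitted values can be tried on a small example. The sketch below is an illustration under assumptions, not the poster's model: it uses gls() from nlme rather than nlme(), and simulated data in which the error standard deviation grows in proportion to x, so the estimated power should come out near 1.

```r
library(nlme)

set.seed(123)
n <- 200
sim <- data.frame(x = runif(n, 1, 10))
sim$y <- 2 + 3 * sim$x + rnorm(n, sd = 0.5 * sim$x)   # sd proportional to x

# Variance covariate is a data column: no updating needed during optimization
fit_x <- gls(y ~ x, data = sim, weights = varPower(form = ~ x))

# Default form = ~ fitted(.): the covariate depends on the model being fit
fit_fitted <- gls(y ~ x, data = sim, weights = varPower())

# Estimated power parameters of the two variance functions
power_x      <- coef(fit_x$modelStruct$varStruct, unconstrained = FALSE)
power_fitted <- coef(fit_fitted$modelStruct$varStruct, unconstrained = FALSE)

# A variance function with the power held fixed, as suggested in the thread
vf_fixed <- varPower(fixed = 0.5)

plot(fit_x)   # standardized residuals against fitted values
```

The residual plot at the end follows the advice in the quoted reply to always look at the default plot before deciding how to proceed.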
[R] Fit copulas to data
Hello!

For those of you who have ever dealt with copulas in R, you could maybe help me:

I have used R to fit a couple of bivariate Archimedean copulas to financial data. R gives a parameter, a z-value and a third number that is supposedly some kind of p-value. An example of what I get after fitting a Gumbel copula:

      Estimate Std. Error  z value Pr(>|z|)
param 1.636907 0.07953911 20.57990        0
The maximized loglikelihood is  333.3923
The convergence code is  0

Who can tell me what the std error and the value of Pr(>|z|) mean? I would guess that the fit is pretty bad due to the small p-value, but I think what is tested here is the Null that the parameter is 0. Clearly, this Null would be rejected as the p value is small. But I am not sure what this outcome really means.

Could somebody explain, or just give an interpretation of the outcome of the fitCopula command?

Thank you so much,
Emkay
--
View this message in context: http://www.nabble.com/Fit-copulas-to-data-tp25948717p25948717.html
Sent from the R help mailing list archive at Nabble.com.
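The numbers in the printed table can be checked by hand. The sketch below assumes, as the poster guesses, that the z value is a Wald statistic (estimate divided by its standard error) and that Pr(>|z|) is the matching two-sided normal p-value for the null hypothesis param = 0; this reading is an assumption, not something the output itself states. Since the Gumbel parameter is at least 1, with 1 corresponding to independence, a comparison against 1 is added as a possibly more relevant check.

```r
est <- 1.636907      # Estimate from the posted output
se  <- 0.07953911    # Std. Error from the posted output

# Wald statistic and two-sided p-value for H0: param = 0
z <- est / se
p <- 2 * pnorm(-abs(z))

# Same statistic for H0: param = 1 (independence for the Gumbel copula)
z_indep <- (est - 1) / se

# Approximate 95% confidence interval for the parameter
ci <- est + c(-1, 1) * qnorm(0.975) * se

round(z, 4)   # reproduces the z value column of the output
```

Under this reading a small p-value says the parameter is estimated to be clearly different from the null value; it is not a goodness-of-fit test and does not indicate a bad fit.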
Re: [R] Invoking par(mfrow...) on an already subdivided plot
Maxwell Reback wrote:
> I'd like to generate on a single device multiple plots, each of which contains two plots. Essentially, I've got sub-plots which consist of two tracks, the upper one displaying gene expression data, and the lower one mapping position. I'd like to display four of these two-track sub-plots on one device, but I can't seem to invoke the par(mfrow=...) or layout(matrix(...)) functions at more than one level. I've got something like:
>
> plot.gene <- function(gene){
>   par(fig=c(0,1,0.3,1))            # expression: track 1
>   plot(...)
>   par(fig=c(0,1,0,0.4), new=TRUE)  # position: track 2
>   plot(...)}
>
> plot.multigene <- function(gene,...){
>   pdf(paste(gene, ".pdf", sep=""))
>   par(mfrow=c(2,2))
>   tapply(gene, gene, plot.gene)
>   dev.off()}
>
> The par(mfrow=) in plot.multigene, even when 'new=TRUE', is disregarded and I just get one sub-plot per page. Suggestions?

You are right. You can use grid or other packages in order to use recursive kinds of subdivisions.

Uwe Ligges

> (Thanks!)
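The reply points to grid for truly nested subdivisions. When the nesting is as regular as here (four sub-plots, two tracks each), a flat alternative in base graphics is a single layout() call with eight regions. This is a different technique from the one suggested in the reply, shown only as a sketch; the gene names and the plotted data are made up.

```r
set.seed(1)
genes <- paste0("gene", 1:4)                 # hypothetical gene names
out <- file.path(tempdir(), "multigene.pdf")

pdf(out)
# 4 rows x 2 columns; regions are numbered so that each gene's two tracks
# are filled consecutively (1-2 top left, 3-4 top right, 5-6 and 7-8 below)
layout(matrix(c(1, 3,
                2, 4,
                5, 7,
                6, 8), nrow = 4, byrow = TRUE),
       heights = c(2, 1, 2, 1))              # expression track twice as tall
par(mar = c(2, 4, 2, 1))                     # small margins so all panels fit
for (g in genes) {
  # track 1: expression
  plot(rnorm(20), type = "b", main = g, ylab = "expression")
  # track 2: position
  plot(sort(runif(20)), rep(1, 20), pch = "|", yaxt = "n", ylab = "position")
}
dev.off()
```

The plotting function must not call par(fig=...) or par(mfrow=...) itself, since either would reset the layout.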
[R] How to plot multiple data sets with different colors (also with legend)?
The following commands only show the data in 'y'. I'm wondering how to show the data in 'x' as well. I also want to add a legend to show that blue points correspond to 'x' and yellow points correspond to 'y'. Could somebody let me know what the correct commands should be?

x=rbind(c(10,11),c(10,11))
y=cbind(-1:0,-1:0)
plot(y,col='yellow')
points(x,col='blue')
[R] How to create MULTILEVELS in a dataset??
Dear R users

I have a data set which has five variables: one dependent variable y, and 4 independent variables (education-level, householdincome, countrygdp and countrygdpsquare). The first two are data corresponding to the individual and the next two correspond to the country to which the individual belongs. My data set does not make this distinction between individual level and country level. Is there a way to make R treat countrygdp and countrygdpsquare as being at a different level than the individual level data? In other words I wish to transform my dataset such that it recognizes the two individual level variables to be at Level-1 and the other two country level variables at Level-2.

I need to run a multilevel model, but first I must make my dataset recognise data at Level-1 and Level-2.

How can I create this country level group (gdp and gdp^2) such that I can perform a multilevel model as follows:

lmer(y ~ education-level + householdincome + countrygdp + countrygdpsquare + (1 | Level2), family=binomial(link="probit"), data=dataset)

Please kindly help me with the relevant commands for creating this Level2 (having two variables).

Thanks
Saurav

Dr.Saurav Pathak
PhD, Univ.of.Florida
Mechanical Engineering
Doctoral Student
Innovation and Entrepreneurship
Imperial College Business School
s.patha...@imperial.ac.uk
0044-7795321121
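One common way to set up such data, offered here as a sketch rather than as an answer from the list, is to keep a single data frame with one row per individual and a column identifying the country; country-level variables are then simply repeated within each country, and the grouping factor is what goes into the (1 | ...) term. All names and values below (people, countries, country, education) are made up for illustration.

```r
# Level-1 (individual) data with a country identifier
people <- data.frame(
  country         = c("A", "A", "B", "B", "C"),
  y               = c(1, 0, 1, 1, 0),
  education       = c(2, 3, 1, 4, 2),
  householdincome = c(30, 45, 20, 60, 35)
)

# Level-2 (country) data: one row per country
countries <- data.frame(
  country    = c("A", "B", "C"),
  countrygdp = c(1.2, 3.4, 2.1)
)
countries$countrygdpsquare <- countries$countrygdp^2

# Merge so every individual row carries its country's values
dataset <- merge(people, countries, by = "country")
dataset$country <- factor(dataset$country)   # grouping factor for the model
dataset
```

With data in this shape, the model formula would use the grouping factor, for example y ~ education + householdincome + countrygdp + countrygdpsquare + (1 | country); in current versions of lme4 a probit model of this kind is fitted with glmer() rather than lmer().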
Re: [R] Putting names on a ggplot
hello,

On Sun, Oct 18, 2009 at 08:29:19AM -0700, John Kane wrote:
> Thanks Stefan, the annotate approach works beautifully. I had not got that far in Hadley's book apparently :(
>
> I'm not convinced though that the explanation
>
>> you shouldn't use aes in this case since nampost, temprange, ... are not part of the dataframe year.
>
> makes sense since it seems to work in this case unless I am missing something obvious. Mind you I'm good at missing the obvious especially in ggplot.

hmm, I thought that aes() is used to map variables from the used dataset (data=...) to parts of the plot. But obviously this is wrong, since your example below works.

In my (maybe out of date) pdf version of Hadley's book on page 46 it says:

|Any variable in an aes() specification must be contained inside the plot or
|layer data. This is one of the ways in which ggplot2 objects are guaranteed
|to be entirely self-contained, so that they can be stored and re-used.

so, maybe the problem with your version is that you map variables of your environment (namposts, trange) to aesthetics, which works fine in your current environment, but fails if you save the plot and load it in another environment.

| Error in eval(expr, envir, enclos) : object 'namposts' not found

If you don't use save() on plots (I do not) this may not be a problem for you.

regards,
stefan

> Example
> =====
> library(ggplot2)
> month <- rep(month.abb[1:12], each=30)
> days <- rep(1:30, 12)
> temps <- rnorm(length(month), mean=25, sd=8)
> duration <- 1:length(month)
> timedata <- data.frame(month, days, temps, duration)
> head(timedata)
> mbs <- c(1, seq(30, 360, by=30))
> namposts <- mbs[1:12]
> mlabs <- month.name[1:12]
> trange <- range(timedata$temps)
> drange <- range(duration)
>
> p <- ggplot(timedata, aes(duration, temps, colour=month)) + geom_line() +
>   opts(legend.position = "none", title="Yearly temperatures",
>        axis.text.x = theme_blank(), axis.ticks = theme_blank())
>
> p <- p + geom_vline(xintercept= mbs) +
>   ylab("Temperature (C)") + xlab("Daily Temperatures") +
>   geom_text(aes(x = namposts+2.5, y = trange[2], label = mlabs),
>             data = timedata, size = 2.5, colour='black', hjust = 0, vjust = 0)
> p
> =====
>
> [earlier messages of Sat, 10/17/09, quoted in full in the previous post of this thread]
Re: [R] using a custom color sequence for image()
On 10/17/2009 01:25 AM, Rajarshi Guha wrote:
> Hi, I'd like to use a custom color sequence (black - low values, green - high values) in an image() plot. While I can specify colors (say a sequence of grays) to the col argument, the ordering is getting messed up. I have two questions:
>
> 1. How can I get a sequence of say 256 colors starting from black and ending in green?
>
> 2. How is this specified to image() such that it uses the colors in the proper ordering?

Hi Rajarshi,

Adding to what Albert wrote, if you mean that you want black to gray and then gray to green, you can get it by creating two sequences of colors and then joining them together:

ccolors <- c(rgb(seq(0,0.5,length.out=128), seq(0,0.5,length.out=128), seq(0,0.5,length.out=128)),
             rgb(seq(0.5,0,length.out=128), seq(0.5,1,length.out=128), seq(0.5,0,length.out=128)))
image(..., col=ccolors, ...)

See the examples in color2D.matplot for more information.

Jim
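For the direct black-to-green sequence asked about in question 1, base R's colorRampPalette() is one option. The sketch below is not taken from the replies in this thread; it uses the built-in volcano matrix as stand-in data. image() assigns the first colour in col to the lowest values of z and the last colour to the highest, so the ordering of the vector is what controls the mapping.

```r
# 256 colours interpolated from black (low values) to green (high values)
cols <- colorRampPalette(c("black", "green"))(256)

# volcano is a built-in numeric matrix, used here only as example data
image(volcano, col = cols)

cols[c(1, 256)]   # first and last colour of the ramp
```

Reversing the mapping only requires rev(cols).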
Re: [R] How to plot multiple data sets with different colors (also with legend)?
Hi,

the blue point is not shown simply because it is printed outside the current plot area. If you want to use the base graphics, you have to manually define the xlim and ylim of the plot. A legend is added with the command legend. E.g.

x=rbind(c(10,11),c(10,11))
y=cbind(-1:0,-1:0)
plot(y,col='yellow', xlim=c(-1,11), ylim=c(-1,11))
points(x,col='blue')
legend("topleft", c("x","y"), col=c('blue', 'yellow'), pch=1)

This is nevertheless most easily done in ggplot2. E.g.

library(ggplot2)
# put the whole data in a data frame
# and add a new variable to distinguish both
dat <- data.frame(rbind(x,y), var=rep(c('x','y'), each=2))
qplot(x=X1, y=X2, colour=var, data=dat)

HTH,
Matthieu
Re: [R] Putting names on a ggplot
--- On Sun, 10/18/09, m...@z107.de m...@z107.de wrote: From: m...@z107.de m...@z107.de Subject: Re: [R] Putting names on a ggplot To: John Kane jrkrid...@yahoo.ca Cc: R R-help r-h...@stat.math.ethz.ch Received: Sunday, October 18, 2009, 6:05 PM hello, On Sun, Oct 18, 2009 at 08:29:19AM -0700, John Kane wrote: Thanks Stefan, the annotate approach works beautifully. I had not got that far in Hadley's book apparently :( I'm not convinced though that the explaination you shouldn't use aes in this case since nampost, temprange, ... are not part of the dataframe year. makes sense since it seems to work in this case unless I am missing something obvious. Mind you I'm good at missing the obvious especially in ggplot. hmm, I thought that aes() is used to map variables from the used dataset (data=...) to parts of the plot. But obviously this is wrong, since your example below works. In my (maybe out of date) pdf version of Hadley's book an page 46 it says: |Any variable in an aes() specification must be contained inside the |plot or |layer data. This is one of the ways in which ggplot2 objects are |guaranteed |to be entirely self-contained, so that they can be stored and |re-used. I just realized I deleted Hadley's pdf so I cannot check this but it sounds right. so, maybe the problem with your version is, you map variables of your environment (namposts, trange) to aesthetic, which works fine in your current environment, but fails if you save the plot and load it in another environmento. | Error in eval(expr, envir, enclos) : object 'namposts' not found If you don't use save() on plots (I do not) this maybe not a problem for you. That's interesting. We're getting different error messages. I get p Error in data.frame(..., check.names = FALSE) : arguments imply differing number of rows: 12, 366 which makes your first idea seem more reasonable but I don't see why my other example would work if it was a data.frame problem :( Thanks for following up on this. 
I just installed 2.9.2 yesterday so I think I'm up to date on ggplot2.
Example:
    library(ggplot2)
    month <- rep(month.abb[1:12], each=30)
    days <- rep(1:30, 12)
    temps <- rnorm(length(month), mean=25, sd=8)
    duration <- 1:length(month)
    timedata <- data.frame(month, days, temps, duration)
    head(timedata)
    mbs <- c(1, seq(30, 360, by=30))
    namposts <- mbs[1:12]
    mlabs <- month.name[1:12]
    trange <- range(timedata$temps)
    drange <- range(duration)
    p <- ggplot(timedata, aes(duration, temps, colour=month)) + geom_line() + opts(legend.position = "none", title="Yearly temperatures", axis.text.x = theme_blank(), axis.ticks = theme_blank())
    p <- p + geom_vline(xintercept= mbs) + ylab("Temperature (C)") + xlab("Daily Temperatures") + geom_text(aes(x = namposts+2.5, y = trange[2], label = mlabs), data = timedata, size = 2.5, colour='black', hjust = 0, vjust = 0)
    p
--- On Sat, 10/17/09, m...@z107.de wrote:
From: m...@z107.de
Subject: Re: [R] Putting names on a ggplot
To: John Kane jrkrid...@yahoo.ca
Cc: R R-help r-h...@stat.math.ethz.ch
Received: Saturday, October 17, 2009, 5:53 PM
hi,
On Sat, Oct 17, 2009 at 02:04:43PM -0700, John Kane wrote:
Putting names on a ggplot
    p <- p + geom_text(aes(x = namposts + 2.5, y = temprange[2], label = mlabs), data = year, size = 2.5, colour='black', hjust = 0, vjust = 0)
you shouldn't use aes in this case since namposts, temprange, ... are not part of the dataframe year. It should also work with geom_text I guess, but I prefer annotate for things like that:
    p + annotate("text", x=namposts+2.5, y = temprange[2], label=mlabs, size=2.5, colour='black', hjust = 0, vjust = 0)
regards, stefan
Re: [R] How to plot multiple data sets with different colors (also with legend)?
On 10/19/2009 07:37 AM, Peng Yu wrote:
The following commands only show the data in 'y'. I'm wondering how to show the data in 'x' as well. I also want to add a legend to show that blue points correspond to 'x' and yellow points correspond to 'y'. Could somebody let me know what the correct commands should be?
    x=rbind(c(10,11),c(10,11))
    y=cbind(-1:0,-1:0)
    plot(y,col='yellow')
    points(x,col='blue')
Hi Peng, To show the x points, you will have to set both the xlim and ylim arguments:
    plot(y,col="yellow",xlim=c(-1,11),ylim=c(-1,11))
    points(x,col="blue")
I'm not sure why you are passing the points as matrices, but note that both rows of x are the same point (10, 11), so the two x points are drawn on top of each other.
Jim
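A minimal base-graphics sketch of the advice above, assuming the same x and y as in the thread. Instead of typing the limits by hand, it derives them from both point sets with range(); the names xlim_all and ylim_all and the use of a null pdf device are illustrative choices, not part of the original messages.

```r
# The two point sets from the thread (both rows of x are the same point)
x <- rbind(c(10, 11), c(10, 11))
y <- cbind(-1:0, -1:0)

# Limits that cover both sets: column 1 is horizontal, column 2 is vertical
xlim_all <- range(x[, 1], y[, 1])
ylim_all <- range(x[, 2], y[, 2])

pdf(NULL)  # null device so the sketch runs non-interactively; drop to see the plot
plot(y, col = "yellow", xlim = xlim_all, ylim = ylim_all, xlab = "", ylab = "")
points(x, col = "blue")
legend("topleft", legend = c("x", "y"), col = c("blue", "yellow"), pch = 1)
invisible(dev.off())
```

Computing the limits from the data keeps the plot correct if either point set changes later.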
[R] rbind to array members
Hi, I would like to add rows to arbitrary tables within a 3-dimensional array. I can directly add data to an existing row of a table:
    x <- array(0,c(1,3,2))
    x[,,1] <- c(1,2,3)
And I can even add a row to the table and assign to another object.
    y <- rbind(x[,,1], c(4,5,6))
and 'y' is what I want it to be:
    y
         [,1] [,2] [,3]
    [1,]    1    2    3
    [2,]    4    5    6
but I can't do this within the 3-dimensional array:
    x[,,1] <- rbind(x[,,1], c(4,5,6))
    Error in x[, , 1] <- rbind(x[, , 1], c(4, 5, 6)) : number of items to replace is not a multiple of replacement length
Does anyone know how I can add rows to one of the tables within the array? Thanks.
[R] CRAN (and crantastic) updates this week
CRAN (and crantastic) updates this week
New packages
* adaptTest (1.0) Marc Vandemeulebroecke http://crantastic.org/packages/adaptTest
  The functions defined in this program serve for implementing adaptive two-stage tests. Currently, four tests are included: Bauer and Koehne (1994), Lehmacher and Wassmer (1999), Vandemeulebroecke (2006), and the horizontal conditional error function. User-defined tests can also be implemented. Reference: Vandemeulebroecke, An investigation of two-stage tests, Statistica Sinica 2006.
* bcv (1.0) Patrick O. Perry http://crantastic.org/packages/bcv
  This package implements methods for choosing the rank of an SVD approximation via cross validation. It provides both Gabriel-style block holdouts and Wold-style speckled holdouts. Also included is an implementation of the SVDImpute algorithm. For more information about Bi-cross-validation, see Owen and Perry's 2009 AOAS article (at http://arxiv.org/abs/0908.2062) and Perry's 2009 PhD thesis (at http://arxiv.org/abs/0909.3052).
* dse1 (2009.10-1) Paul Gilbert http://crantastic.org/packages/dse1
  This package is only to aid transition to the unbundled dse package. It has no functions, but simply requires package EvalEst.
* dse2 (2009.10-1) Paul Gilbert http://crantastic.org/packages/dse2
  This package is only to aid transition to the unbundled dse package. It has no functions, but simply requires package EvalEst.
* EvalEst (2009.10-2) Paul Gilbert http://crantastic.org/packages/EvalEst
  Multivariate Time Series - extensions. See ?00dse-Intro for more details.
* glmdm (0.51) Jeff Gill http://crantastic.org/packages/glmdm
  R CODE FOR SIMULATION OF GLMDM
* integrativeME (1.1) Kim-Anh Le Cao http://crantastic.org/packages/integrativeME
  Mixture of experts models (Jacobs et al., 1991) were introduced to account for nonlinearities and other complexities in the data. They are based on a divide-and-conquer strategy. Mixtures of experts are of interest due to their wide applicability and the advantages of fast learning via the expectation-maximization (EM) algorithm. We have extended and implemented mixture of experts to combine categorical clinical factors and continuous microarray data in a binary classification framework to analyze cancer studies. To provide a hybrid signature of clinical factors and gene markers, we propose to apply different gene selection procedures as a first step.
* integrativeMEdata (1.0) Kim-Anh Le Cao http://crantastic.org/packages/integrativeMEdata
  This package contains data sets with matched categorical clinical factors and microarray data for three cancer studies. This is part of the integrativeME package that combines these two types of variables in a binary classification framework by selecting a hybrid signature of clinical factors and gene markers.
* latticedl (1.0) Toby Dylan Hocking http://crantastic.org/packages/latticedl
  Direct labeling functions that use the lattice package.
* nodeHarvest (0.1) Nicolai Meinshausen http://crantastic.org/packages/nodeHarvest
  Node harvest is a simple interpretable tree-like estimator for high-dimensional regression and classification. A few nodes are selected from an initially large ensemble of nodes, each associated with a positive weight. New observations can fall into one or several nodes and predictions are the weighted average response across all these groups. The package offers visualization of the estimator. Predictions can return the nodes a new observation fell into, along with the mean response of training observations in each node, offering a simple explanation of the prediction.
* sublogo (1.0) Toby Dylan Hocking http://crantastic.org/packages/sublogo
  Visualize correlation in biological sequence data using sublogo dendrogram plots.
Updated packages adimpro (0.7.3), aplpack (1.2.2), approximator (1.1-6), aws (1.6-1), aylmer (1.0-4), BACCO (2.0-4), BB (2009.9-1), binMto (0.0-4), boot (1.2-41), bootspecdens (3.0), bqtl (1.0-25), CalciOMatic (1.1-3), calibrator (1.1-7), ccgarch (0.1.7), choplump (1.0), CircStats (0.2-4), clv (0.3-2), compare (0.2-3), condGEE (0.1-3), contfrac (1.1-8), Davies (1.1-5), deSolve (1.5), Devore7 (0.7.2), dlnm (1.0.2), dplR (1.1.9.4), drfit (0.05-95), dse (2009.10-1), dti (0.8-2), EDR (0.6-3), effects (2.0-9), effects (2.0-10), eha (1.2-12), eiPack (0.1-6), emulator (1.1-7), epicalc (2.9.2.8), FGN (1.2), FitAR (1.79), forward (1.0.3), fts (0.7.6), gamlss (3.0-1), gamlss.cens (3.0.1), gamlss.data (3.0-1), gamlss.dist (3.0-1), gamlss.mx (3.0-1), gamlss.nl (3.0-1), gamlss.tr (3.0-1), geoR (1.6-27), geoRglm (0.8-26), gld (1.8.4), glmmML (0.81-6), gplots (2.7.2), grouped (0.6-0), HAPim (1.3), hash (1.0.2), hdrcde (2.12), heplots (0.8-10), intervals (0.13.1), irtoys (0.1.2), isotone (0.8-7), ivivc (0.1.5), Kendall (2.1), kernlab (0.9-9), ks (1.6.8), lcda (0.2),
Re: [R] How to plot multiple data sets with different colors (also with legend)?
On Sun, Oct 18, 2009 at 5:42 PM, Matthieu Dubois matth...@gmail.com wrote:
Hi, the blue point is not shown simply because it is printed outside the current plot area. If you want to use the base graphics, you have to manually define the xlim and ylim of the plot. Legend is added with the command legend. E.g.
    x=rbind(c(10,11),c(10,11))
    y=cbind(-1:0,-1:0)
    plot(y,col='yellow', xlim=c(-1,11), ylim=c(-1,11))
    points(x,col='blue')
    legend("topleft", c("x","y"), col=c('blue', 'yellow'), pch=1)
This is nevertheless most easily done in ggplot2. E.g.
    library(ggplot2)
    # put the whole data in a data frame
    # and add a new variable to distinguish both
    dat <- data.frame(rbind(x,y), var=rep(c('x','y'), each=2))
    qplot(x=X1,y=X2, colour=var, data=dat)
qplot generates a figure with a background grid. If I just want a blank background (as in plot), what options should I specify? How do I specify colors like 'red' and 'blue' explicitly? I have read the reviews of the ggplot2 book on Amazon. The ratings are unanimously high. I want to know how much effort I should spend to learn ggplot2 versus conventional R graphics packages. Can ggplot2 do all the graphics tasks? Is it much easier to learn than conventional graphics packages?
[R] Error: object 'cloud' not found
Hi, I installed the lattice package so I can plot 3D cloud scatterplots:
    install.packages("lattice")
But (after successfully installing from the Berkeley mirror), R insists it cannot find the cloud function, part of the lattice package:
    cloud
    Error: object 'cloud' not found
What did I do wrong? Thanks! PT
Re: [R] Error: object 'cloud' not found
On 19/10/2009, at 2:16 PM, PerfectTiling wrote:
Hi, I installed the lattice package so I can plot 3D cloud scatterplots:
    install.packages("lattice")
But (after successfully installing from the Berkeley mirror), R insists it cannot find the cloud function, part of the lattice package:
    cloud
    Error: object 'cloud' not found
What did I do wrong?
You didn't ***load*** the package, i.e. you didn't do
    library(lattice)
cheers, Rolf Turner
Re: [R] Error: object 'cloud' not found
library(lattice)
Roy M.
On Oct 18, 2009, at 6:16 PM, PerfectTiling wrote:
Hi, I installed the lattice package so I can plot 3D cloud scatterplots:
    install.packages("lattice")
But (after successfully installing from the Berkeley mirror), R insists it cannot find the cloud function, part of the lattice package:
    cloud
    Error: object 'cloud' not found
What did I do wrong? Thanks! PT
** The contents of this message do not reflect any position of the U.S. Government or NOAA. **
Roy Mendelssohn, Supervisory Operations Research Analyst, NOAA/NMFS Environmental Research Division, Southwest Fisheries Science Center, 1352 Lighthouse Avenue, Pacific Grove, CA 93950-2097
e-mail: roy.mendelss...@noaa.gov (Note new e-mail address) voice: (831)-648-9029 fax: (831)-648-8440 www: http://www.pfeg.noaa.gov/
Old age and treachery will overcome youth and skill. From those who have been given much, much will be expected
Re: [R] Error: object 'cloud' not found
Hi PT, Try loading the lattice package first:
    # install.packages('lattice')
    require(lattice)
    ?cloud
HTH, Jorge
On Sun, Oct 18, 2009 at 9:16 PM, PerfectTiling wrote:
Hi, I installed the lattice package so I can plot 3D cloud scatterplots:
    install.packages("lattice")
But (after successfully installing from the Berkeley mirror), R insists it cannot find the cloud function, part of the lattice package:
    cloud
    Error: object 'cloud' not found
What did I do wrong? Thanks! PT
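The distinction the replies in this thread draw, installing a package versus attaching it to the search path, can be sketched as follows. This assumes lattice is already installed (it ships with most R distributions, but that is not guaranteed); the variable names are illustrative only.

```r
pkg <- "lattice"

# install.packages(pkg)  # one-time step: copies the package into the library

# requireNamespace() only checks that the package can be loaded;
# it does not put its functions on the search path
installed <- requireNamespace(pkg, quietly = TRUE)

if (installed) {
  library(pkg, character.only = TRUE)  # attach: cloud() is now visible
}

on_search_path <- paste0("package:", pkg) %in% search()
cloud_visible  <- exists("cloud", mode = "function")
```

library() has to be repeated in every new R session; install.packages() does not.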
Re: [R] rbind to array members
library(abind) ## array binding
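For context on why the assignment in the original question fails: every slice of an R array must have the same shape, so one table cannot gain a row on its own. The abind package suggested above offers abind(..., along = ) for joining arrays; the sketch below stays in base R and shows two possible workarounds under that constraint. The names tables and grown are illustrative.

```r
x <- array(0, c(1, 3, 2))
x[, , 1] <- c(1, 2, 3)

# Option 1: keep the tables in a list, so each can have its own row count
tables <- lapply(seq_len(dim(x)[3]),
                 function(k) matrix(x[, , k], nrow = dim(x)[1]))
tables[[1]] <- rbind(tables[[1]], c(4, 5, 6))

# Option 2: grow the whole array by one row; every slice gains the row
grown <- array(0, dim(x) + c(1, 0, 0))
grown[seq_len(dim(x)[1]), , ] <- x   # copy the existing rows
grown[2, , 1] <- c(4, 5, 6)          # fill the new row of table 1
```

Option 1 suits tables that really differ in size; option 2 keeps a true array at the cost of padding the other tables with zeros.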
[R] Why points() is defined specially for a 1 by 2 matrix?
    x=cbind(1:4,3:6)
    png('one_point.png')
    plot(x[1:3,],xlim=c(-1,11),ylim=c(-1,11),pch=1)
    points(x[4,],pch=2) # this is plotted as two points although I meant only one point
    legend("topleft", c("x","y"),pch=c(1,2))
    dev.off()
The above code will produce 5 points instead of 4 points. If I want to have 4 points, I have to use the following code. But the code below is a little bit tedious. I'm wondering if there is a way that I can still use the above code (with a little change) to generate 4 points?
    x=cbind(1:4,3:6)
    png('one_point_split.png')
    plot(x[1:3,1],x[1:3,2],xlim=c(-1,11),ylim=c(-1,11),pch=1)
    points(x[4,1],x[4,2],pch=2)
    legend("topleft", c("x","y"),pch=c(1,2))
    dev.off()
[R] Identifying similar but not identical rows in a dataframe
I would like to identify _almost_ duplicated rows in a data frame. For example, I might declare as duplicates pairs of rows that are alike at about 80% of their columns. When working with tens of thousands of rows and upwards of 20 columns an iterative approach, testing all permutations, can be time consuming. Duplicated() with incomparables sounds like the ticket. But previous discussion in this forum indicates that specifying an incomparable value when using duplicated() on a data frame is not yet implemented. Any suggestions about how to implement this efficiently would be appreciated. All data are numerical, and each datum could, for example, be reduced to a byte representation in a string. A fuzzy matching approach with agrep() might be possible. Thanks.
[R] Filtering on a dataframe- newbie question
Hi, newbie question. I have a data-frame with 3 named columns: Name, Obs1, Obs2. The Name column members are made of alphanumeric characters: T1, T2, T3 etc. I would like to access only that subset of the data-frame with Name == T44.
    X <- dataframe[dataframe$Name=='T44']
does not work. Any ideas on how to do this? I'm sure I'm missing a simple concept here. Thanks, Anjan
-- anjan purkayastha, phd bioinformatics analyst whitehead institute for biomedical research nine cambridge center cambridge, ma 02142 purkayas [at] wi [dot] mit [dot] edu 703.740.6939
Re: [R] Filtering on a dataframe- newbie question
Hi there, Try ?subset
?str may also be useful..
bests milton
On Sun, Oct 18, 2009 at 11:10 PM, ANJAN PURKAYASTHA anjan.purkayas...@gmail.com wrote:
Hi, newbie question. I have a data-frame with 3 named columns: Name, Obs1, Obs2. The Name column members are made of alphanumeric characters: T1, T2, T3 etc. I would like to access only that subset of the data-frame with Name == T44.
    X <- dataframe[dataframe$Name=='T44']
does not work. Any ideas on how to do this? I'm sure I'm missing a simple concept here. Thanks, Anjan
-- anjan purkayastha, phd bioinformatics analyst whitehead institute for biomedical research nine cambridge center cambridge, ma 02142 purkayas [at] wi [dot] mit [dot] edu 703.740.6939
Re: [R] How to plot multiple data sets with different colors (also with legend)?
Hi Peng, Comments below.
On Sun, Oct 18, 2009 at 9:22 PM, Peng Yu pengyu...@gmail.com wrote:
On Sun, Oct 18, 2009 at 5:42 PM, Matthieu Dubois matth...@gmail.com wrote:
Hi, the blue point is not shown simply because it is printed outside the current plot area. If you want to use the base graphics, you have to manually define the xlim and ylim of the plot. Legend is added with the command legend. E.g.
    x=rbind(c(10,11),c(10,11))
    y=cbind(-1:0,-1:0)
    plot(y,col='yellow', xlim=c(-1,11), ylim=c(-1,11))
    points(x,col='blue')
    legend("topleft", c("x","y"), col=c('blue', 'yellow'), pch=1)
This is nevertheless most easily done in ggplot2. E.g.
    library(ggplot2)
    # put the whole data in a data frame
    # and add a new variable to distinguish both
    dat <- data.frame(rbind(x,y), var=rep(c('x','y'), each=2))
    qplot(x=X1,y=X2, colour=var, data=dat)
qplot generates a figure with a background grid. If I just want a blank background (as in plot), what options should I specify? How do I specify colors like 'red' and 'blue' explicitly?
You can get a more traditional look by issuing
    theme_set(theme_bw())
before the call to qplot(). The colors are controlled by a scale, which you can override as follows:
    qplot(x=X1,y=X2, colour=var, data=dat) + scale_colour_manual(values = c("red","green"))
I have read the reviews of the ggplot2 book on Amazon. The ratings are unanimously high. I want to know how much effort I should spend to learn ggplot2 versus conventional R graphics packages. Can ggplot2 do all the graphics tasks? Is it much easier to learn than conventional graphics packages?
ggplot2 can do most things that can be done in base graphics. It makes many things that are difficult in base easy, like faceting and mapping variables to a wide variety of scales. I myself use ggplot2 almost exclusively. I don't know base graphics at all, and I'm able to accomplish all my graphing needs with ggplot2. I would not say it's easier than base graphics, just different. Some things are easier with base graphics, other things are easier with ggplot. I use it because I like the consistent and rational user interface (and the default theme is nice to look at). The place to start learning ggplot2 (while you're waiting for the book to be shipped perhaps) is http://had.co.nz/ggplot2/.
-Ista
-- Ista Zahn Graduate student University of Rochester Department of Clinical and Social Psychology http://yourpsyche.org
Re: [R] Why points() is defined specially for a 1 by 2 matrix?
    points(x[4,],pch=2) # this is plotted as two points
x[4,] drops what it sees as an unnecessary dimension, so points() receives a plain vector rather than a one-row matrix. Use
    points(x[4,, drop=FALSE], pch=2)
See FAQ 7.5
    tmp <- matrix(1:2)
    tmp
    tmp[,1]
    tmp[,1,drop=FALSE]
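A short sketch of why the dropped dimension turns one point into two. points() interprets its argument through xy.coords(): a plain vector is treated as y values plotted against their index, while a two-column matrix is treated as (x, y) pairs. The variable names are illustrative.

```r
x <- cbind(1:4, 3:6)

v <- x[4, ]                # dimension dropped: a plain vector c(4, 6)
m <- x[4, , drop = FALSE]  # dimension kept: a 1 x 2 matrix

# How the plotting functions read each object
xy_v <- xy.coords(v)  # two points: (1, 4) and (2, 6)
xy_m <- xy.coords(m)  # one point:  (4, 6)

is.null(dim(v))  # TRUE
dim(m)           # 1 2
```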
[R] modifying model coefficients
I have built a model but want to then manipulate the coefficients in some way. I can extract the coefficients and make the changes I need, but how do I then put these new coefficients back in the model so I can use the predict function?
    my_model <- lm(x ~ . , data=my_data)
    my_scores <- predict(my_model, my_data)
    my_coeffs <- coef(my_model)
    ## here we manipulate my_coeffs
    ## and then want to set the my_model
    ## coefficients to the new values so we
    ## predict using the new values
    my_model.coefs <- my_coeffs ?? how is this done?
    ?? so that this will work with the new coefficients
    my_scores_new <- predict(my_model, my_data)
Any code snippets would be appreciated very much.
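One way to get predictions from altered coefficients without touching the fitted object is to multiply the design matrix by the new coefficient vector. This is a sketch under stated assumptions, not an answer from the thread: the data, the column names a and b, and the choice to zero out one coefficient are all made up for illustration, and it yields point predictions only (no standard errors or intervals).

```r
set.seed(1)
my_data <- data.frame(a = rnorm(20), b = rnorm(20))
my_data$x <- 1 + 2 * my_data$a - my_data$b + rnorm(20, sd = 0.1)

my_model  <- lm(x ~ ., data = my_data)
my_scores <- predict(my_model, my_data)

my_coeffs <- coef(my_model)
my_coeffs["a"] <- 0            # hypothetical manipulation

# Design matrix used in the fit; its columns line up with the coefficients
X <- model.matrix(my_model)
stopifnot(identical(colnames(X), names(my_coeffs)))

baseline      <- drop(X %*% coef(my_model))  # reproduces predict()
my_scores_new <- drop(X %*% my_coeffs)       # predictions with new values
```

For data other than the fitting data, the design matrix would have to be rebuilt from the model's terms and the new data frame.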
Re: [R] Filtering on a dataframe- newbie question
On 19/10/2009, at 4:23 PM, milton ruser wrote:
Hi there, Try ?subset
No. Don't. Just do:
    X <- dataframe[dataframe$Name=='T44',]
Note the comma in the penultimate position. Read up on array indexing; see ?"[" and An Introduction to R, section 5.2 for starters. A data frame is two dimensional; you were basically trying to treat it as one dimensional. And don't call your data frame ``dataframe''. Would you call your dog ``dog''? (Pace Barry Rowlingson! :-) )
cheers, Rolf Turner
?str may also be useful..
str is indeed useful --- but not here. R. T.
bests milton
On Sun, Oct 18, 2009 at 11:10 PM, ANJAN PURKAYASTHA anjan.purkayas...@gmail.com wrote:
Hi, newbie question. I have a data-frame with 3 named columns: Name, Obs1, Obs2. The Name column members are made of alphanumeric characters: T1, T2, T3 etc. I would like to access only that subset of the data-frame with Name == T44.
    X <- dataframe[dataframe$Name=='T44']
does not work. Any ideas on how to do this? I'm sure I'm missing a simple concept here. Thanks, Anjan
-- anjan purkayastha, phd bioinformatics analyst whitehead institute for biomedical research nine cambridge center cambridge, ma 02142 purkayas [at] wi [dot] mit [dot] edu 703.740.6939
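A small sketch of the row-indexing point made above, using a made-up data frame called obs with the three columns from the question. With the trailing comma the logical vector selects rows; without it, a single index is interpreted as a column selection.

```r
# Hypothetical data with the column layout described in the question
obs <- data.frame(Name = c("T1", "T44", "T3", "T44"),
                  Obs1 = c(1.2, 3.4, 5.6, 7.8),
                  Obs2 = c(10, 20, 30, 40),
                  stringsAsFactors = FALSE)

# Row index before the comma, (empty) column index after it
rows_T44 <- obs[obs$Name == "T44", ]

# Without the comma the four logical values are matched against the three
# columns, which is not what was intended; try() keeps the sketch running
no_comma <- try(obs[obs$Name == "T44"], silent = TRUE)
```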
[R] What is the difference between prcomp and princomp?
Some webpages describe prcomp and princomp, but I am still not quite sure what the major difference between them is. Can they be used interchangeably? In help, it says
'princomp' only handles so-called R-mode PCA, that is feature extraction of variables. If a data matrix is supplied (possibly via a formula) it is required that there are at least as many units as variables. For Q-mode PCA use 'prcomp'.
What are R-mode and Q-mode? Are they just two different numerical methods to compute PCA?
[R] order of interaction coeff in lm
Hi. I haven't found this question asked elsewhere, so I hope I am not missing something trivial.
    y<-rnorm(1:10)
    x1<-rnorm(1:10)
    x2<-rnorm(1:10)
    x3<-rnorm(1:10)
    x4<-rnorm(1:10)
    reg<-lm(y~x1*x2+x3+x4)
    summary(reg)
The output of this puts x1:x2 after x3 and x4. In my case this is very cumbersome because I have about 50 dummy variables instead of just x3 and x4. Is there any way I can get R to print out x1:x2 right after x1 and x2? I can always do:
    x1_x2<-x1*x2
    reg<-lm(y~x1+x2+x1_x2+x3+x4)
But I was wondering if there is something more elegant, since I like the interaction term denoted by a colon. Thanks.
R 2.9.2/vista x64
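One possible approach to the ordering question, offered as a sketch rather than a reply from the thread: by default the formula machinery sorts terms by interaction order (all main effects first), and terms() has a keep.order argument that preserves the order in which the terms were written. The data frame d below is simulated for illustration.

```r
set.seed(42)
d <- data.frame(y = rnorm(10), x1 = rnorm(10), x2 = rnorm(10),
                x3 = rnorm(10), x4 = rnorm(10))

# Default: main effects first, so x1:x2 comes after x3 and x4
reg_default <- lm(y ~ x1 * x2 + x3 + x4, data = d)

# keep.order = TRUE keeps the terms in the order written in the formula
reg_ordered <- lm(terms(y ~ x1 + x2 + x1:x2 + x3 + x4, keep.order = TRUE),
                  data = d)

names(coef(reg_default))
names(coef(reg_ordered))
```

Both fits describe the same model; only the order of the coefficients in the output differs.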
Re: [R] What is the difference between prcomp and princomp?
On Sun, Oct 18, 2009 at 10:42 PM, Peng Yu pengyu...@gmail.com wrote:
Some webpages describe prcomp and princomp, but I am still not quite sure what the major difference between them is. Can they be used interchangeably? In help, it says
'princomp' only handles so-called R-mode PCA, that is feature extraction of variables. If a data matrix is supplied (possibly via a formula) it is required that there are at least as many units as variables. For Q-mode PCA use 'prcomp'.
What are R-mode and Q-mode? Are they just two different numerical methods to compute PCA?
Also, it seems that 'loadings' of princomp is the same as 'rotation' of prcomp. I'm wondering why they have different names in the two functions.
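A small comparison on the built-in USArrests data may help with the question above. As documented in their help pages, prcomp() works from a singular value decomposition of the centred data and uses the divisor n - 1 for variances, while princomp() works from an eigen decomposition of the covariance matrix and uses the divisor n. The sketch below checks that 'rotation' and 'loadings' agree up to sign and that the standard deviations differ only by that divisor; the variable names are illustrative.

```r
pc_svd <- prcomp(USArrests)    # SVD of centred data, divisor n - 1
pc_eig <- princomp(USArrests)  # eigen decomposition of covariance, divisor n

n <- nrow(USArrests)

rot <- pc_svd$rotation             # what prcomp calls 'rotation'
lod <- unclass(loadings(pc_eig))   # what princomp calls 'loadings'

# Same directions, possibly with flipped signs
max_diff <- max(abs(abs(rot) - abs(lod)))

# Standard deviations differ by the factor sqrt(n / (n - 1))
sd_ratio <- unname(pc_svd$sdev / pc_eig$sdev)
```

For data with at least as many observations as variables, the two functions therefore give essentially the same components.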
Re: [R] How to create MULTILEVELS in a dataset??
Hi Saurav, I was waiting for someone else to answer you, because I'm not sure I'll be able to explain clearly. But since no one is jumping on it, I'll take a stab.
On Sun, Oct 18, 2009 at 5:52 PM, saurav pathak pathak.sau...@gmail.com wrote:
Dear R users, I have a data set which has five variables: one dependent variable y, and 4 independent variables (education-level, householdincome, countrygdp and countrygdpsquare). The first two are data corresponding to the individual and the next two correspond to the country to which the individual belongs. My data set does not make this distinction between individual level and country level. Is there a way to make R treat countrygdp and countrygdpsquare as being at a different level than the individual level data? In other words I wish to transform my dataset such that it recognizes the two individual level variables to be at Level-1 and the other two country level variables at Level-2.
If you're using lmer I don't think you need to do anything special in terms of data preparation. You will need an explicit country code I think.
I need to run a multilevel model, but first I must make my dataset recognise data at Level-1 and Level-2. How can I create this country level group (gdp and gdp^2) such that I can perform a multilevel model as follows:
    lmer(y ~ education-level + householdincome + countrygdp + countrygdpsquare + (1 | Level2),family=binomial(link="probit"),data=dataset)
I think you just need to specify country as the grouping variable:
    lmer(y ~ education-level + householdincome + countrygdp + countrygdpsquare + (1 | country),family=binomial(link="probit"),data=dataset)
Please kindly help me with the relevant commands for creating this Level2 (having two variables)
I hope this helps -- I think it's less complicated than you were assuming.
-Ista
Thanks Saurav
Dr.Saurav Pathak PhD, Univ.of.Florida Mechanical Engineering Doctoral Student Innovation and Entrepreneurship Imperial College Business School s.patha...@imperial.ac.uk 0044-7795321121
-- Ista Zahn Graduate student University of Rochester Department of Clinical and Social Psychology http://yourpsyche.org
Re: [R] Why points() is defined specially for a 1 by 2 matrix?
Peng Yu wrote:
On Sun, Oct 18, 2009 at 10:26 PM, Richard M. Heiberger r...@temple.edu wrote:
    points(x[4,],pch=2) # this is plotted as two points
drops what it sees as an unnecessary dimension. Use
    points(x[4,, drop=FALSE], pch=2)
See FAQ 7.5
    tmp <- matrix(1:2)
    tmp
    tmp[,1]
    tmp[,1,drop=FALSE]
Can I specify 'drop' to FALSE by default so that I don't have to specify it explicitly?
Not that I know of, but things will be easier with the following idiom:
    x = cbind(1:4,3:6)
    plot(x[,1],x[,2],pch=rep(1:2,c(3,1)))
even easier if the plot types correspond to a factor variable in a data frame:
    dat = data.frame(x=1:4,y=3:6,type=factor(c(1,1,1,2)))
    with(dat,plot(x,y,pch=as.numeric(type)))
To answer one of your other questions: ggplot (and lattice) is/are very powerful, but base graphics are (a) easier to get your head around and (b) easier to adjust if you don't like the defaults. Changing things just a little bit in ggplot can be difficult (as an example, the answer to your other question about getting rid of grid lines has to do with theme_blank(), something like +options(grid.panel.minor=theme_blank()) [try googling theme_blank for a few examples])