[R] Arguments to lm() within a function - object not found
Hi all,

I'm having some difficulty passing arguments into lm() from within a function, and I was hoping someone wiser in the ways of R could tell me what I'm doing wrong. I have the following:

lmwrap <- function(...) {
    wts <- somefunction()
    print(wts)  # This works; wts has the values I expect
    fit <- lm(weights = wts, ...)
    return(fit)
}

If I call my function lmwrap, I get the following error:

> lmwrap(a ~ b)
Error in eval(expr, envir, enclos) : object "wts" not found

A traceback gives me the following:

8: eval(expr, envir, enclos)
7: eval(extras, data, env)
6: model.frame.default(formula = ..1, weights = wts, drop.unused.levels = TRUE)
5: model.frame(formula = ..1, weights = wts, drop.unused.levels = TRUE)
4: eval(expr, envir, enclos)
3: eval(mf, parent.frame())
2: lm(weights = wts, ...)
1: lmwrap(a ~ b)

It seems like whatever environment lm() is trying to evaluate wts in doesn't have it defined. Could anyone tell me what I'm doing wrong?

As a side note, I do have a workaround, but this strikes me as really the wrong thing to do. I replace the call to lm with:

eval(substitute(lm(weights = dummy, ...), list(dummy = wts)))

which works.

Thanks,
Pete

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Arguments to lm() within a function - object not found
Thanks very much for the quick reply. I had looked at the help for lm, but I clearly skimmed over the critical part explaining where weights is evaluated.

Thanks,
Pete

On 13/8/2008, Prof Brian Ripley wrote:

>On Wed, 13 Aug 2008, Pete Berlin wrote:
>
>> Hi all,
>>
>> I'm having some difficulty passing arguments into lm() from within a
>> function, and I was hoping someone wiser in the ways of R could tell me
>> what I'm doing wrong. I have the following:
>>
>> lmwrap <- function(...) {
>>     wts <- somefunction()
>>     print(wts)  # This works; wts has the values I expect
>>     fit <- lm(weights = wts, ...)
>>     return(fit)
>> }
>>
>> If I call my function lmwrap, I get the following error:
>>
>>> lmwrap(a ~ b)
>> Error in eval(expr, envir, enclos) : object "wts" not found
>
>Correct. The help (?lm) says
>
>   All of 'weights', 'subset' and 'offset' are evaluated in the same
>   way as variables in 'formula', that is first in 'data' and then in
>   the environment of 'formula'.
>
>> A traceback gives me the following:
>>
>> 8: eval(expr, envir, enclos)
>> 7: eval(extras, data, env)
>> 6: model.frame.default(formula = ..1, weights = wts, drop.unused.levels = TRUE)
>> 5: model.frame(formula = ..1, weights = wts, drop.unused.levels = TRUE)
>> 4: eval(expr, envir, enclos)
>> 3: eval(mf, parent.frame())
>> 2: lm(weights = wts, ...)
>> 1: lmwrap(a ~ b)
>>
>> It seems like whatever environment lm is trying to eval wts in doesn't
>> have it defined.
>>
>> Could anyone tell me what I'm doing wrong?
>>
>> As a sidenote, I do have a workaround, but this strikes me as really the
>> wrong thing to do. I replace the call to lm with:
>>     eval(substitute(lm(weights = dummy, ...), list(dummy = wts)))
>> which works.
>
>It's one workaround, but working with the scoping rules is better. Hint:
>use the 'data' argument to lm.
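[Editorial sketch of the suggested fix, following the hint above: put the computed weights into the data frame, since lm() evaluates 'weights' first in 'data'. The original somefunction() is not shown in the thread, so a simple stub stands in for it, and the column name .wts is hypothetical.]

```r
# Sketch, assuming a stub for somefunction(): store the weights in 'data',
# where lm() looks first when evaluating 'weights'.
lmwrap <- function(formula, data) {
  data$.wts <- seq_len(nrow(data))   # stand-in for somefunction()
  lm(formula, data = data, weights = .wts)
}

d <- data.frame(a = c(1, 2.1, 2.9, 4.2), b = 1:4)
fit <- lmwrap(a ~ b, d)
coef(fit)   # fitted with weights found inside 'data', no scoping tricks
```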
[R] trouble loading candisc
Hello,

I am having trouble loading the package candisc onto my R distribution. I am using 2.7.1-2. I do

> install.packages("candisc")

and get the following output:

Warning in install.packages("candisc") :
  argument 'lib' is missing: using '/usr/local/lib/R/site-library'
--- Please select a CRAN mirror for use in this session ---
Loading Tcl/Tk interface ... done
trying URL 'ftp://ftp.u-aizu.ac.jp/pub/lang/R/CRAN/src/contrib/candisc_0.5-10.tar.gz'
ftp data connection made, file length 27354 bytes
opened URL
==
downloaded 26 Kb

* Installing *source* package 'candisc' ...
** R
** data
** moving datasets to lazyload DB
** inst
** preparing package for lazy loading
Loading required package: car
Loading required package: heplots
** help
 >>> Building/Updating help pages for package 'candisc'
     Formats: text html latex example
  Grass                   text  html  latex  example
        Note: removing empty section \details
  HSB                     text  html  latex  example
        Note: removing empty section \examples
  candisc-package         text  html  latex
  candisc                 text  html  latex  example
  candiscList             text  html  latex  example
  heplot.candisc          text  html  latex  example
  heplot.candiscList      text  html  latex
** building package indices ...
* DONE (candisc)

The downloaded packages are in /tmp/RtmpzNvnNN/downloaded_packages

But when I try to run candisc in R, it does not appear to be there. What am I missing?

Pedro
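[Editorial note, a guess since the session output above ends at installation: install.packages() only installs a package; it must still be attached with library() in each new session before its functions are visible. A sketch of the distinction, demonstrated with MASS because it ships with R (the candisc lines are commented out since they need a network install):]

```r
# install.packages("candisc")   # installs into the library (done once)
# library(candisc)              # attaches it (needed in every new session)

# Same two-step pattern with MASS, installed with R but not attached by default:
library(MASS)
exists("lda")   # TRUE only after the package is attached
```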
[R] candisc plotting
Hello,

I have a file with two dependent variables (three and five) and one independent variable. I do

i.mod <- lm(cbind(three, five) ~ species, data=i.txt)

and get the following output:

Coefficients:
             three   five
(Intercept)  9.949   9.586
species     -1.166  -1.156

I then do

i.can <- candisc(i.mod, data=i)

and get the following output:

Canonical Discriminant Analysis for species:

    CanRsq  Eigenvalue  Difference  Percent  Cumulative
1 0.096506     0.10681                  100         100

Test of H0: The canonical correlations in the
current row and all that follow are zero

  LR test stat  approx F  num Df  den Df    Pr(> F)
1        0.903    63.875       1     598  6.859e-15 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

This is different from the output I get with SAS:

  Eigenvalue  Difference  Proportion  Cumulative       Ratio  F Value  Num DF  Den DF  Pr > F
1    0.10681           .      1.0000      1.0000  0.90349416    31.88       2     597  <.0001

I am also wondering how to plot can1*can1 like it is done in SAS:

proc plot;
   plot can1*can1=species;
   format species spechar.;
   title2 'Plot of Constits_vs_cassettes';
run;

Thanks
Re: [R] candisc plotting
Dear Michael,

> You haven't told us what your data is, and we can only surmise -- not very
> helpful for you and annoying for those who try to help.

Apologies, I am brand new to R and this mailing list. I will try to be more concise. Here is a NEW version of my data:

  Curvature  Diameter    Quality
1      2.95      6.63     Passed
2      2.53      7.79     Passed
3      3.57      5.65     Passed
4      3.16      5.47     Passed
5      2.58      4.46  NotPassed
6      2.16      6.22  NotPassed
7      3.27      3.52  NotPassed

What I am trying to get from the candisc method is a one-dimensional scatterplot that separates my two groups, Passed and NotPassed. On this data I do

do.mod <- lm(cbind(Diameter, Curvature) ~ Quality, data=do)

and do.mod produces:

Coefficients:
               Diameter  Curvature
(Intercept)      4.7333     2.6700
QualityPassed    1.6517     0.3825

I then run the candisc method:

do.can <- candisc(do.mod, data=do)

This produces:

Canonical Discriminant Analysis for Quality:

   CanRsq  Eigenvalue  Difference  Percent  Cumulative
1 0.91354      10.566                  100         100

Test of H0: The canonical correlations in the
current row and all that follow are zero

  LR test stat  approx F  num Df  den Df    Pr(> F)
1        0.086    52.831       1       5  0.0007706 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

What I *think* I would like to plot is the discriminant function of each sample 1-7. Here is an example of what I am trying to do with candisc:
http://people.revoledu.com/kardi/tutorial/LDA/Numerical%20Example.html

Thanks

On Thu, Dec 11, 2008 at 3:36 PM, Michael Friendly wrote:

> Dear Pete,
>
> You haven't told us what your data is, and we can only surmise -- not very
> helpful for you and annoying for those who try to help.
>
> Pete Shepard wrote:
>
>> Hello,
>>
>> I have a file with two dependent variables (three and five) and one
>> independent variable. I do i.mod <- lm(cbind(three, five) ~ species,
>> data=i.txt) and get the following output:
>>
>> Coefficients:
>>              three   five
>> (Intercept)  9.949   9.586
>> species     -1.166  -1.156
>>
> From this, it seems that species is a numeric variable, not a factor.
> If so, canonical discriminant analysis is not appropriate, so
> all following bets are off.
>
> That's likely why you end up with only one canonical dimension.
>
>> I do i.can <- candisc(i.mod, data=i)
>
> Is data=i the same as data=i.txt?
>
>> and get the following output:
>>
>> Canonical Discriminant Analysis for species:
>>
>>     CanRsq  Eigenvalue  Difference  Percent  Cumulative
>> 1 0.096506     0.10681                  100         100
>>
>> Test of H0: The canonical correlations in the
>> current row and all that follow are zero
>>
>>   LR test stat  approx F  num Df  den Df    Pr(> F)
>> 1        0.903    63.875       1     598  6.859e-15 ***
>> ---
>> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>>
>> This is different from the output I get with SAS:
>
> What was your SAS code? Was the data the same?
>
>>   Eigenvalue  Difference  Proportion  Cumulative       Ratio  F Value  Num DF  Den DF  Pr > F
>> 1    0.10681           .      1.0000      1.0000  0.90349416    31.88       2     597  <.0001
>
>> I am also wondering how to plot can1*can1 like it is done in SAS.
>>
>> proc plot;
>>    plot can1*can1=species;
>>    format species spechar.;
>>    title2 'Plot of Constits_vs_cassettes';
>> run;
>
> If you want to compare plots for canonical analysis in SAS and R,
> see my macros, canplot and hecan, at
> http://www.math.yorku.ca/SCS/sasmac/
>
> But in general, if all you have is 1 canonical dimension, a dotplot or
> boxplot of the canonical scores would be more useful than a scatterplot
> of can1 * can1.
>
> The plot method for candisc objects in the candisc package has some
> code to handle the 1 can-D case.
>
> hope this helps
> -Michael
>
>> Thanks
>
> --
> Michael Friendly     Email: friendly AT yorku DOT ca
> Professor, Psychology Dept.
> York University      Voice: 416 736-5115 x66249   Fax: 416 736-5814
> 4700 Keele Street    http://www.math.yorku.ca/SCS/friendly.html
> Toronto, ONT M3J 1P3 CANADA
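[Editorial sketch of the one-dimensional display Michael suggests: a boxplot of the canonical scores by group, with the individual points overlaid. The score values below are illustrative stand-ins (borrowed from the later candisc thread in this digest), not output from this poster's data.]

```r
# Hypothetical: boxplot of Can1 scores by group for a 1-dimensional
# canonical analysis; the numbers here are illustrative only.
scores <- data.frame(
  Quality = c("Passed", "Passed", "Passed", "Passed",
              "NotPassed", "NotPassed", "NotPassed"),
  Can1    = c(-2.38, -2.70, -3.47, -0.96, 4.23, 2.61, 2.68))

boxplot(Can1 ~ Quality, data = scores,
        ylab = "Can1 score", main = "Groups on the single canonical dimension")
stripchart(Can1 ~ Quality, data = scores,
           vertical = TRUE, add = TRUE, pch = 19)  # overlay the raw points
```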
[R] candisc
Hello,

I have a question regarding the candisc package. My data are:

species  three  five
      1   2.95  6.63
      1   2.53  7.79
      1   3.57  5.65
      1   3.16  5.47
      2   2.58  4.46
      2   2.16  6.22
      2   3.27  3.52

I put these in a table and then fit a linear model:

> newdata <- lm(cbind(three, five) ~ species, data=rawdata)

and then run candisc on it:

> candata <- candisc(newdata)

Here are my scores:

> candata$scores
  species       Can1
1       1 -2.3769280
2       1 -2.7049437
3       1 -3.4748309
4       1 -0.9599825
5       2  4.2293774
6       2  2.6052193
7       2  2.6820884

and here are my coefficients:

> candata$coeffs.raw
           Can1
three -5.185380
five  -2.160237

> candata$coeffs.std
           Can1
three -2.530843
five  -2.586620

My question is: what is the precise equation that gives candata$scores?

Thanks
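[Editorial note, worth checking against the candisc documentation: one relationship that reproduces the printed scores (to within the rounding of the printed coefficients) is the raw coefficients applied to the mean-centered variables, i.e. Can1 = -5.185380*(three - mean(three)) - 2.160237*(five - mean(five)). A sketch using the numbers above:]

```r
three <- c(2.95, 2.53, 3.57, 3.16, 2.58, 2.16, 3.27)
five  <- c(6.63, 7.79, 5.65, 5.47, 4.46, 6.22, 3.52)
coeffs.raw <- c(-5.185380, -2.160237)

# Center each variable at its grand mean, then take the linear combination
X    <- cbind(three, five)
Can1 <- scale(X, center = TRUE, scale = FALSE) %*% coeffs.raw
Can1[, 1]   # approximately candata$scores$Can1 (up to coefficient rounding)
```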
[R] discrete variable
Hello,

I am sorry for asking such a basic question; I could not find an answer to it using Google.

I have a discrete variable (a vector x) taking, for example, the following values:

0, 3, 4, 3, 15, 5, 6, 5

Is it possible to know how many different values (modalities) it takes? Here it takes 6 different values, but the length of the vector is 8. I would like to know if there is a way to get the set of modalities {0, 3, 4, 15, 5, 6} together with the number of times each one is taken {1, 2, 1, 1, 2, 1}.

Thank you very much.

P.S.: are there some useful functions for discrete variables?
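[Editorial sketch: base R covers exactly this. unique() gives the set of modalities, length(unique(x)) the number of distinct values, and table() each modality with its count (note table() sorts by value rather than order of appearance). factor() is the usual way to declare a variable discrete.]

```r
x <- c(0, 3, 4, 3, 15, 5, 6, 5)

length(unique(x))   # number of distinct values: 6
unique(x)           # the modalities, in order of appearance: 0 3 4 15 5 6
table(x)            # counts, sorted by value: 0->1, 3->2, 4->1, 5->2, 6->1, 15->1
```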
[R] vector manipulations
Hello,

I have simulated a set of data which I called "nir" (a vector). I have created a function "logl" which calculates the log-likelihood. logl is a function of 2 real parameters, "beta" and "zeta" (each of length 1). This function works perfectly well when I try, for example, logl(0.1, 0.2).

Now if I try:

x = seq(0.1, 0.5, by=10^(-1))
y = seq(0.1, 0.5, by=10^(-1))
z = outer(x, y, logl)

I get an error. The problem seems to be that inside logl, the following expression is calculated:

sum( log( beta + (nir-1)*zeta ) )

So it is a vector manipulation. The error tells me that "nir" is not the size of "zeta". Yet usually this is no problem, since length(zeta) = 1. When I replace sum( log( beta + (nir-1)*zeta ) ) with a loop, I get no error, but I think it slows down the program.

Do you have an idea where the problem is?

Thank you very much.
Re: [R] vector manipulations
Thank you very much to both of you, and especially you Phil. I will tell you if it works.

2008/3/4, Phil Spector <[EMAIL PROTECTED]>:
>
> Pete -
>    As others have told you, outer only works with vectorized
> functions. An alternative is to use expand.grid to find all
> the combinations of beta and zeta, and then use apply to
> calculate your likelihood for each row. I believe that this
> will work:
>
>    allvals = expand.grid(beta=seq(0.1,0.5,by=10^(-1)),
>                          zeta=seq(0.1,0.5,by=10^(-1)))
>    answer = cbind(allvals,
>                   result = apply(allvals,1,function(x)logl(x[1],x[2])))
>
> The columns of answer will be named beta, zeta and result, with
> (hopefully) obvious meanings.
>
>                                        - Phil Spector
>                                          Statistical Computing Facility
>                                          Department of Statistics
>                                          UC Berkeley
>                                          [EMAIL PROTECTED]
>
> On Tue, 4 Mar 2008, Pete Dorothy wrote:
>
>> Hello,
>>
>> I have simulated a set of data which I called "nir" (a vector).
>> I have created a function "logl" which calculates the log-likelihood.
>> logl is a function of 2 real parameters : "beta" and "zeta" (of length 1).
>> This function works perfectly well when I try for example "logl(0.1,0.2)"
>>
>> Now if I try :
>>
>> x=seq(0.1,0.5,by=10^(-1))
>> y=seq(0.1,0.5,by=10^(-1))
>> z=outer(x,y,logl)
>>
>> I get an error.
>>
>> The problem seems to be that inside "logl", the following expression is
>> calculated : "sum( log( beta+(nir-1)*zeta ) )". So it is a vector
>> manipulation. The error tells me that "nir" is not the size of "zeta". Yet
>> usually it is no problem since "length(zeta)=1".
>>
>> When I replace "sum( log( beta+(nir-1)*zeta ) )" by a loop, I get no
>> mistake. But I think it slows down the program.
>>
>> Do you have an idea where the problem is ?
>>
>> Thank you very much.
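[Editorial sketch: a runnable version of Phil's expand.grid suggestion. Since the thread never shows the real logl or nir, a stand-in log-likelihood with the expression quoted in the question is used here.]

```r
# Stand-in for the poster's data and log-likelihood (assumptions):
nir  <- 1:10
logl <- function(beta, zeta) sum(log(beta + (nir - 1) * zeta))

# All (beta, zeta) combinations, then one likelihood value per row:
allvals <- expand.grid(beta = seq(0.1, 0.5, by = 0.1),
                       zeta = seq(0.1, 0.5, by = 0.1))
answer  <- cbind(allvals,
                 result = apply(allvals, 1, function(x) logl(x[1], x[2])))
head(answer)   # columns: beta, zeta, result
```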
Re: [R] vector manipulations
Thank you everybody. Phil, your expand.grid works very nicely and I will use it for non-vectorized functions.

Yet I am a bit confused about "vectorization". For me it is synonymous with "no loop". :-( I wrote a toy example (with a function which is not my log-likelihood).

FIRST PART

nir = 1:10
logl = function(x, y, nir) sum(log(x*nir + y))
x = seq(0.1, 0.3, by=10^(-1))
y = seq(0.1, 0.3, by=10^(-1))
z = outer(x, y, logl, nir=nir)

This does not work. Can you explain to me why it is not "vectorised"?

SECOND PART

nir = 1:10
logl2 = function(x, y, nir) {
    a = 0
    for (i in 1:10) {
        a = a + log(x*nir[i] + y)
    }
    return(a)
}
x = seq(0.1, 0.3, by=10^(-1))
y = seq(0.1, 0.3, by=10^(-1))
z2 = outer(x, y, logl2, nir=nir)

This seems to work, though the function does not seem to be vectorized.

I am sorry for being such a noob. I'm OK at maths but I am bad at programming. I bought a book on R (Introductory Statistics with R by Dalgaard) on Amazon last week; I will read it when I receive it. Do you know other good books?

2008/3/5, [EMAIL PROTECTED] <[EMAIL PROTECTED]>:
>
> No problems with it working. The main problem I have observed is
> unrealistic expectations. People write an *essentially* non-vectorized
> function and expect Vectorize to produce a version of it which will
> out-perform explicit loops every time. No magic bullets in this game.
>
> Bill.
>
> Bill Venables
> CSIRO Laboratories
> PO Box 120, Cleveland, 4163
> AUSTRALIA
> Office Phone (email preferred): +61 7 3826 7251
> Fax (if absolutely necessary):  +61 7 3826 7304
> Mobile: +61 4 8819 4402
> Home Phone: +61 7 3286 7700
> mailto:[EMAIL PROTECTED]
> http://www.cmis.csiro.au/bill.venables/
>
> -----Original Message-----
> From: Duncan Murdoch [mailto:[EMAIL PROTECTED]]
> Sent: Wednesday, 5 March 2008 9:36 AM
> To: Venables, Bill (CMIS, Cleveland)
> Cc: r-help@r-project.org
> Subject: Re: [R] vector manipulations
>
> On 3/4/2008 5:41 PM, [EMAIL PROTECTED] wrote:
>> Your problem is that your function logl( , ) is not vectorized with
>> respect to its arguments. For a function to work in outer(...) it must
>> accept vectors for its first two arguments and it must produce a
>> parallel vector of responses.
>>
>> To quote the help information for outer:
>>
>> "FUN is called with these two extended vectors as arguments. Therefore,
>> it must be a vectorized function (or the name of one), expecting at
>> least two arguments."
>>
>> Sometimes Vectorize can be used to make a non-vectorized function into a
>> vectorized one, but the results are not always entirely satisfactory in
>> my experience.
>
> What problems have you seen?
>
> Duncan Murdoch
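[Editorial sketch of the point the thread is making with the toy example: outer() passes whole vectors of x and y to FUN in a single call, and logl's sum() collapses everything to one number, so outer() cannot fill its matrix. logl2 "works" because its loop runs over nir, leaving x and y elementwise, i.e. it really is vectorized in x and y. Vectorize() can wrap the scalar version, with the speed caveats Bill mentions:]

```r
nir  <- 1:10
logl <- function(x, y, nir) sum(log(x * nir + y))   # scalar in x and y

x <- seq(0.1, 0.3, by = 0.1)
y <- seq(0.1, 0.3, by = 0.1)

# Vectorize() loops over (x, y) pairs internally, which is exactly what
# outer() requires; nir is passed through unchanged to every call.
z <- outer(x, y, Vectorize(logl, c("x", "y")), nir = nir)
dim(z)   # 3 x 3 matrix of log-likelihood values
```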
[R] getting % influence for 2 factors in LDA
Hello R-list,

I am performing an lda on the following data:

  Curvature  Diameter    Quality
1      2.95      6.63     Passed
2      2.53      7.79     Passed
3      3.57      5.65     Passed
4      3.16      5.47     Passed
5      2.58      4.46  NotPassed
6      2.16      6.22  NotPassed
7      3.27      3.52  NotPassed

I use lda:

> ddd <- lda(Quality ~ Curvature + Diameter, data=momo)
> ddd
Call:
lda(Quality ~ Curvature + Diameter, data = momo)

Prior probabilities of groups:
NotPassed    Passed
      0.5       0.5

Group means:
           Curvature  Diameter
NotPassed       2.63      5.38
Passed      3.016667      6.69

and ddd$scaling gives me:

               LD1
Curvature 2.372923
Diameter  1.368995

My question is: I would like to calculate the % influence that one factor has vs. the other, e.g. does one factor have 0.6 influence and the other 0.4?

TIA
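[Editorial sketch of one common heuristic; this is an assumption on my part, not an lda() feature: compare the absolute discriminant coefficients, ideally after multiplying each by its variable's pooled within-group standard deviation so the comparison is scale-free.]

```r
# Hypothetical heuristic: each variable's share of |coefficient| on LD1.
# NOTE: with raw coefficients this confounds measurement scale with
# "influence"; scale each coefficient by its variable's pooled
# within-group SD first for a scale-free version.
scaling   <- c(Curvature = 2.372923, Diameter = 1.368995)   # ddd$scaling
influence <- abs(scaling) / sum(abs(scaling))
round(influence, 2)   # roughly 0.63 vs 0.37 here
```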
[R] heatmap and automatic box sizes
Dear R,

I have a list of X,Y coordinates and a ratio associated with each coordinate. The X and Y coordinates are continuous but random from 50-500. I would like to make a continuous heatmap of the ratios at each coordinate. One caveat is that the coordinates are clustered together, so some boxes might have too little data. I am wondering if there is a way that R can automatically adjust box sizes?

A sample data set is below:

  X    Y  RATIO
 50   56     .1
 50   59     .1
 52   54     .2
500  393     .9
450   36     .7
250  190     .7

TIA
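[Editorial sketch of one possible approach, my own suggestion rather than a built-in "auto box size" option: use quantile-based breaks with cut() so each marginal bin holds the same number of points (clustered regions get smaller boxes, sparse regions larger ones), average the ratio per cell with tapply(), and draw it with image(). Packages such as hexbin offer related binned displays. The data below are simulated stand-ins.]

```r
# Data-adaptive (unequal-width) boxes via quantile breaks.
set.seed(1)
df <- data.frame(X = runif(200, 50, 500),
                 Y = runif(200, 50, 500),
                 RATIO = runif(200))   # stand-in for the real data

nb <- 4   # boxes per axis
xb <- cut(df$X, quantile(df$X, seq(0, 1, length.out = nb + 1)),
          include.lowest = TRUE)      # each X bin holds ~200/nb points
yb <- cut(df$Y, quantile(df$Y, seq(0, 1, length.out = nb + 1)),
          include.lowest = TRUE)

m <- tapply(df$RATIO, list(xb, yb), mean)   # mean ratio per box
image(seq_len(nb), seq_len(nb), m, xlab = "X bin", ylab = "Y bin")
```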