Re: [R] FW: How to fit an linear model withou intercept
Hi Mark,

as a last comment, you may also take a look at ?summary.lm, where you will notice that R reports two different R-squared values depending on the presence or absence of an intercept term. For comparison purposes you should ensure that you use the same mathematical object. There was a thread about this in Jan 2006 (from which I essentially took Prof. Ripley's reply for this answer), as you can see in http://tolstoy.newcastle.edu.au/R/help/06/01/18923.html

hth.

Leeds, Mark (IED) wrote:
> Eik : Today, I've been reading Myers' text, "Classical and Modern Regression
> with Applications", to refresh my memory about regression because it's been
> a while since I looked at that material. The subtraction of the means from
> both sides of the equation causing the intercept to be zero now makes more
> sense because, in the simple regression case,
>
> b0 = y bar - b1 x bar
>
> and, by subtracting the means, y bar and x bar both become zero, so b0 = zero.
>
> If you have any other comments, they are very appreciated and always invited,
> but I think between what you showed and the above, it's clearer now. I think
> I will go with centering both the left and right hand sides to force the zero
> intercepts, estimate each model with the intercept (which will hopefully
> numerically estimate the intercept as very close to zero) and then compare
> the R-squareds of the two models. If you still see this as a problem, let me
> know, because I am totally open to listening to other people's brains,
> especially good ones like yours.
>
> -----Original Message-----
> From: Eik Vettorazzi [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, August 28, 2007 8:33 AM
> To: Leeds, Mark (IED)
> Cc: R-help
> Subject: Re: FW: [R] How to fit an linear model withou intercept
>
> Hi Mark,
> I don't know whether you received a sufficient reply or not, so here are my
> comments on your problem.
> Suppressing the constant term in a regression model will probably lead to a
> violation of the classical assumptions for this model.
Re: [R] FW: How to fit an linear model withou intercept
Hi Mark,

I don't know whether you received a sufficient reply or not, so here are my comments on your problem.

Suppressing the constant term in a regression model will probably lead to a violation of the classical assumptions for this model.
From the OLS normal equations (in matrix notation)
(1) (X'X)b = X'y
and the definition of the OLS residuals
(2) e = y - Xb
you get, by substituting y from (2) into (1),
    (X'X)b = (X'X)b + X'e
and hence
    X'e = 0.
Without a constant term you cannot ensure that the OLS residuals e = (y - Xb) will have zero mean, which holds when a constant term is included, since in that case the first equation of X'e = 0 gives sum(e) = 0.

For decomposing the TSS (y'y) into ESS (b'X'Xb) and RSS (e'e), which is needed to compute R², you need X'e = 0, because then the cross-product term b'X'e vanishes.
Correct me if I'm wrong.

Leeds, Mark (IED) wrote:
> Park, Eik : Could you start from the bottom and read this when you have
> time. I really appreciate it.
>
> Basically, in a nutshell, my question is the "Hi John" part and I want
> to do my study correctly. Thanks a lot.
>
> -----Original Message-----
> From: Leeds, Mark (IED)
> Sent: Thursday, August 23, 2007 1:05 PM
> To: 'John Sorkin'
> Cc: '[EMAIL PROTECTED]'
> Subject: RE: [R] How to fit an linear model withou intercept
>
> Hi John : I'm from the R-list obviously and that was a nice example
> that I cut and pasted and learned from. I'm sorry to bother you but I
> had a non-R question that I didn't want to pose to the R-list because I
> think it's been discussed a lot in the past but I never focused on the
> discussion.
>
> I need to do a study where I decide between two different univariate
> regression models. The LHS is the same in both cases and it's not the
> goal of the study to build a prediction model but rather to see which
> RHS (univariate) explains the LHS better.
> It's actually in a time series framework also, but that's not relevant
> for my question.
> My question has 2 parts:
>
> 1) I was leaning towards using the R-squared as the decision criterion
> (I will be regressing monthly and over a couple of years, so I will have
> about 24 R-squareds. I have tons of data for one monthly regression, so I
> don't have to just do one big regression over the whole time period),
> but I noticed in your previous example that the model with intercept
> (compared to the model forced to have zero intercept) had a lower R^2
> and a lower standard error at the same time! So this asymmetry
> leads me to think that maybe I should be using standard error rather
> than R-squared as my criterion?
>
> 2) This is possibly related to 1: Isn't there a problem with using the
> R-squared for anything when you force no intercept?
> I think I remember seeing discussions about this on the list. That's why
> I was thinking of including the intercept
> (the intercept in my problem really has no meaning, but I wanted to retain
> the validity of the R-squared). But, now that I see your email, maybe I
> should still be including an intercept and using standard error as the
> criterion.
> Or maybe when you include an intercept (in both cases) you don't get
> this asymmetry between R-squared and standard error.
> I was surprised to see the asymmetry, but maybe it happens because one
> is comparing a model with intercept to a model without intercept, and no
> intercept probably renders the R-squared criterion meaningless in the
> latter.
>
> Thanks for any insight you can provide. I can also center and go without
> intercept, because it sounded like you DEFINITELY preferred that method
> over just not including an intercept at all. I was thinking of sending
> this question to the R-list but I didn't want to get hammered because I
> know that this is not a new discussion. Thanks so much.
>
> Mark
>
> P.S.: How the heck did you get an MD and a Ph.D.? Unbelievable. Did you
> do them at the same time?
>
> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of John Sorkin
> Sent: Thursday, August 23, 2007 9:29 AM
> To: David Barron; Michal Kneifl; r-help
> Subject: Re: [R] How to fit an linear model withou intercept
>
> Michael,
> Assuming you want a model with an intercept of zero, I think we need to
> ask you why you want an intercept of zero. When a "normal" regression
> indicates a non-zero intercept, forcing the regression line to have a
> zero intercept changes the meaning of the regression coefficients. If
> for some reason you want to have a zero intercept, but do not want to
> change the meaning of the regression coefficients, i.e. you still want
> to minimize the sum of the squared deviations from the BLUE (Best
> Linear Unbiased Estimator) of the regression, you can center your
> dependent and independent variables and re-run the regression. Centering
> means subtracting the mean of each variable from the variable before
> performing the regression. When you do this, the int
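
The points made in this thread (residuals summing to zero only with a constant term, centering giving a zero intercept with an unchanged slope, and the two R-squared definitions described in ?summary.lm) can be checked on simulated data. This is only a sketch: the data and all object names (`fit_int`, `fit_zero`, `fit_centered`, ...) are illustrative assumptions, not taken from the original posts.

```r
# Simulated data; values and names are illustrative assumptions.
set.seed(1)
x <- rnorm(100, mean = 5)
y <- 2 + 3 * x + rnorm(100)

fit_int  <- lm(y ~ x)       # model with intercept
fit_zero <- lm(y ~ x - 1)   # intercept forced to zero

# With a constant term, the first row of X'e = 0 gives sum(e) = 0.
# Without it, nothing forces the residuals to sum to zero.
sum_res_int  <- sum(residuals(fit_int))
sum_res_zero <- sum(residuals(fit_zero))

# Centering both sides: the slope is unchanged and the intercept is
# numerically zero.
xc <- x - mean(x)
yc <- y - mean(y)
fit_centered <- lm(yc ~ xc)

# summary.lm uses different baselines for R^2:
#   with intercept:    1 - RSS / sum((y - mean(y))^2)
#   without intercept: 1 - RSS / sum(y^2)
r2_int         <- summary(fit_int)$r.squared
r2_zero        <- summary(fit_zero)$r.squared
r2_int_manual  <- 1 - sum(residuals(fit_int)^2)  / sum((y - mean(y))^2)
r2_zero_manual <- 1 - sum(residuals(fit_zero)^2) / sum(y^2)
```

Because the two R-squared values are measured against different baselines, comparing `r2_int` with `r2_zero` directly is not a like-for-like comparison.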
Re: [R] Help with vector gymnastics
try

5*which(tf)[cumsum(tf)]

Gladwin, Philip wrote:
> Hello,
>
> What is the best way of solving this problem?
>
> answer <- ifelse(tf == TRUE, i * 5, previous answer)
> where as an initial condition
> tf[1] <- TRUE
>
> For example if,
> tf <- c(T,F,F,F,T,T,F)
> over i = 1 to 7
> then the output of the function will be
> answer = 5 5 5 5 25 30 30
>
> Thank you.
>
> Phil,

--
Eik Vettorazzi
Institut für Medizinische Biometrie und Epidemiologie
Universitätsklinikum Hamburg-Eppendorf
Martinistr. 52
22046 Hamburg
T ++49/40/42803-8243
F ++49/40/42803-7790

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
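
To see why the one-liner works, it may help to compare it with an explicit loop on the example from the question. The names `answer_vec` and `answer_loop` are illustrative only.

```r
tf <- c(TRUE, FALSE, FALSE, FALSE, TRUE, TRUE, FALSE)

# which(tf) lists the positions of the TRUE values; cumsum(tf) tells, for
# each index, how many TRUE values have been seen so far, i.e. which of
# those positions is the most recent one.
answer_vec <- 5 * which(tf)[cumsum(tf)]

# Explicit loop for comparison (relies on tf[1] being TRUE, as stated in
# the question).
answer_loop <- numeric(length(tf))
for (i in seq_along(tf)) {
  answer_loop[i] <- if (tf[i]) i * 5 else answer_loop[i - 1]
}

print(answer_vec)
```

Both approaches give 5 5 5 5 25 30 30 for this input.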
Re: [R] symbolic matrix elements...
test=matrix(c(expression(x^3-5*x+4), expression(log(x^2-4*x))))

works. By the way, you received an error because D expects an expression and you offered a list:

> class(test[1])
[1] "list"

To get the error relating to the misuse of the tilde operator, you have to use the "correct" extractor "[[":

f<-test[[1]]
D(f,"x")

On Mon, 18 Sep 2006 18:30:57 +0200, Evan Cooch <[EMAIL PROTECTED]> wrote:
> Normally, I do symbolics in Maple, or Mathematica, but I'm trying to
> write a simple script for students to handle some *very* simple
> calculations (for other purposes) with matrix or vector elements, where
> the elements are coded symbolically. What I've tried with *partial*
> success is use of the tilde (~) operator. So, for example, consider a
> simple vector:
>
> test=matrix(c(~ x^3-5*x+4, ~log(x^2-4*x)))
>
> Now, when I look at test, I see
>
> > test
>      [,1]
> [1,] Expression
> [2,] Expression
>
> Fine. When I try to extract one of the vector elements, I see (for
> example)
>
> > test[1]
> [[1]]
> ~x^3 - 5 * x + 4
>
> Fine - but now I'm trying to figure out how to use the extracted matrix
> element for anything else. For example, using D for simple symbolic
> derivatives
>
> f <- test[1];
> D(f,"x")
>
> should *in theory* work, but I get the following:
>
> > D(f,"x");
> [1] NA
>
> But, if I try
>
> f <- expression(x^3-5*x+4);
> D(f,"x");
>
> works fine.
>
> So, even though it looks as if each element of test is coded as an
> expression, it seems as though it is somehow a different type of
> expression than if I code it explicitly as an expression. I'm *guessing*
> it has to do with the tilde operator not assigning the formula to
> anything, but I'm not sure.
>
> Suggestions? Pointers to the obvious?
>
> Thanks!

--
Universität Hamburg
Institut für Statistik und Ökonometrie
Dipl.-Wi.-Math. Eik Vettorazzi
Von-Melle-Park 5
20146 Hamburg
Tel.: +49 40-42838-3540
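
The difference between the two extractors can be sketched as follows. The example uses a plain expression vector and a plain list of formulas rather than a matrix, and the names `exprs`, `forms`, `d1`, `d2` and `d_formula` are illustrative assumptions.

```r
# An expression vector: "[[" returns the unevaluated call, which D() accepts.
exprs <- expression(x^3 - 5*x + 4, log(x^2 - 4*x))
d1 <- D(exprs[[1]], "x")
d2 <- D(exprs[[2]], "x")

# A list of one-sided formulas: "[" returns a list, "[[" returns the formula.
forms <- list(~ x^3 - 5*x + 4, ~ log(x^2 - 4*x))
cls_single <- class(forms[1])     # "list"
cls_double <- class(forms[[1]])   # "formula"

# The right-hand side of a one-sided formula is its second element,
# and that call can be differentiated.
d_formula <- D(forms[[1]][[2]], "x")

# Evaluate the derivatives at chosen points
v1 <- eval(d1, list(x = 2))
v2 <- eval(d2, list(x = 5))
```

If the tilde notation is preferred for readability, extracting the right-hand side with `[[2]]` as above is one way to keep it.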
Re: [R] negatively skewed data; reflecting
A simple reflection (about the y-axis) of x is -x, but you have to ensure that there are only positive numbers if you want to use the log transformation. So you should reflect about a number z greater than max(x), so that z-x is positive everywhere. Why don't you simply shift your data instead (by an amount z greater than abs(min(x)), if min(x)<0), or use a Box-Cox transformation?

On Wed, 23 Aug 2006 14:08:08 +0200, <[EMAIL PROTECTED]> wrote:
> Hi,
>
> This problem may be very easy, but I can't think of how to do it. I
> have constructed histograms of various variables in my dataset. Some of
> them are negatively skewed, and hence need data transformations
> applied. I know that you first need to reflect the negatively skewed
> data and then apply another transformation such as log, square root etc.
> to bring it towards normality. How is it that I reflect data in R? I'm
> sorry if this seems a very simple task, I think it involves going back
> to Maths GCSE and relearning reflection, rotation, translation etc.! I
> have searched the internet, but cannot come up with anything useful on
> how to reflect data.
>
>> hist(Lsoc) #how do I reflect Lsoc in R?
>
> I am grateful for any help regarding this matter, it is just a very
> small part of my analysis and doesn't seem worth agonising hours over.
> I will probably kick myself when someone tells me the answer!
>
> Thank you very much,
>
> Zoe
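
A minimal sketch of reflecting and then log-transforming, assuming toy data in place of the poster's `Lsoc` variable. The data, the choice `z = max(x) + 1` and the helper `skew` are assumptions made for illustration.

```r
set.seed(42)
x <- 10 - rexp(200)            # toy, negatively skewed data (assumption)

# Simple moment-based sample skewness, defined here for illustration only.
skew <- function(v) mean((v - mean(v))^3) / sd(v)^3

z <- max(x) + 1                # any z > max(x) keeps z - x positive
x_reflected <- z - x           # reflection: now positively skewed, all >= 1
x_log <- log(x_reflected)      # the log is defined for every value

skew_before <- skew(x)
skew_reflected <- skew(x_reflected)
```

Reflection flips the sign of the skewness, so after the log transform larger values correspond to smaller original values; that reversal needs to be kept in mind when interpreting results.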
Re: [R] Vectorizing a "for" loop
res<-outer(rows,columns,FUN=function(x,y) abs(x-y))

will help you.

On Thu, 03 Aug 2006 16:10:46 +0200, Daniel Gerlanc <[EMAIL PROTECTED]> wrote:
> Hello all,
>
> Consider the following problem:
>
> There are two vectors:
>
> rows <- c(1, 2, 3, 4, 5)
> columns <- c(10, 11, 12, 13, 14)
>
> I want to create a matrix with dimensions length(rows) x length(columns):
>
> res <- matrix(nrow = length(rows), ncol = length(columns))
>
> If "i" and "j" are the row and column indexes respectively, the values
> of the cells are abs(rows[i] - columns[j]). The resultant matrix
> follows:
>
>      [,1] [,2] [,3] [,4] [,5]
> [1,]    9   10   11   12   13
> [2,]    8    9   10   11   12
> [3,]    7    8    9   10   11
> [4,]    6    7    8    9   10
> [5,]    5    6    7    8    9
>
> This matrix may be generated by using a simple "for" loop:
>
> for(i in 1:length(rows)){
>   for(j in 1:length(columns)){
>     res[i,j] <- abs(rows[i] - columns[j])
>   }
> }
>
> Is there a quicker, vector-based approach for doing this or a function
> included in the recommended packages that does this?
>
> Thanks!
>
> -- Dan Gerlanc
> Williams College

--
Universität Hamburg
Institut für Statistik und Ökonometrie
Dipl.-Wi.-Math. Eik Vettorazzi
Von-Melle-Park 5
20146 Hamburg
Tel.: +49 40-42838-3540
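
As a quick check that the `outer` call reproduces the loop from the question, the two results can be compared directly. The names `res_outer` and `res_loop` are illustrative.

```r
rows <- c(1, 2, 3, 4, 5)
columns <- c(10, 11, 12, 13, 14)

# Vectorized: outer() applies the function to every (row, column) pair.
res_outer <- outer(rows, columns, FUN = function(x, y) abs(x - y))

# Loop version from the question, for comparison.
res_loop <- matrix(nrow = length(rows), ncol = length(columns))
for (i in seq_along(rows)) {
  for (j in seq_along(columns)) {
    res_loop[i, j] <- abs(rows[i] - columns[j])
  }
}

print(res_outer)
```

Since subtraction is already vectorized, `abs(outer(rows, columns, "-"))` should give the same matrix.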
[R] Plotting league tables/ caterpillar plots
Dear list,

I was wondering if there is a function to plot league tables, sometimes also known as "caterpillar plots"? A league table is conceptually very similar to a box plot. One difference is that the inter-quartile ranges are not shown. If there isn't such a function, a first attempt at a "selfmade" plot would be to tell boxplot not to plot boxes (sounds silly, doesn't it?). I've tried the option "boxwex=0" but the result is unsatisfactory.

An example of a league table can be found in Marshall, Spiegelhalter [1998], Reliability of league tables of in vitro fertilisation clinics, BMJ 1998;316:1701-1705; you may find it at http://bmj.bmjjournals.com

Thanks in advance

Eik Vettorazzi
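
One possible way to build such a plot by hand is with base graphics, drawing a point per unit and a line segment for its interval instead of going through boxplot. This is only a sketch under stated assumptions: `caterpillar_plot` is a hypothetical helper (not an existing R function), and the rates and standard errors are made-up toy values.

```r
# Hypothetical helper, not part of base R or any package.
caterpillar_plot <- function(est, lower, upper, labels = seq_along(est)) {
  ord <- order(est)                        # rank units by point estimate
  n <- length(est)
  plot(est[ord], seq_len(n), xlim = range(lower, upper), pch = 19,
       yaxt = "n", xlab = "Estimate with interval", ylab = "")
  # one horizontal segment per unit, from lower to upper limit
  segments(lower[ord], seq_len(n), upper[ord], seq_len(n))
  axis(2, at = seq_len(n), labels = labels[ord], las = 1)
  invisible(ord)
}

# Toy data (assumption): 10 units with approximate 95% intervals
set.seed(7)
est <- rnorm(10, mean = 0.15, sd = 0.03)
se  <- runif(10, 0.01, 0.04)
ord <- caterpillar_plot(est, est - 1.96 * se, est + 1.96 * se,
                        labels = LETTERS[1:10])
```

The function returns the ordering invisibly, so the ranking used in the plot can be reused, for example to print the table in the same order.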