Re: [R] FW: How to fit an linear model withou intercept

2007-08-29 Thread Eik Vettorazzi
Hi Mark,
as a last comment, you may also want to take a look at
?summary.lm
where you will notice that R computes R-squared differently depending
on whether the model contains an intercept term. For comparisons
you should make sure that you are comparing the same mathematical object.
There was a thread about this in Jan 2006 (from which I essentially took
Prof. Ripley's reply for this answer); see
http://tolstoy.newcastle.edu.au/R/help/06/01/18923.html
hth.
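
To make the point about the two R-squared definitions concrete, here is a
minimal sketch on simulated data (the variable names and the simulated model
are assumptions for illustration, not taken from the thread). As documented
in ?summary.lm, the residual sum of squares is compared with the sum of
squares about the mean of y when the model has an intercept, and with the
sum of squares about zero otherwise:

```r
set.seed(1)
x <- rnorm(50, mean = 5)
y <- 2 + 0.5 * x + rnorm(50)

fit1 <- lm(y ~ x)        # model with intercept
fit0 <- lm(y ~ x - 1)    # intercept suppressed

# R-squared computed by hand, following the definition in ?summary.lm
r2_with    <- 1 - sum(resid(fit1)^2) / sum((y - mean(y))^2)  # about the mean
r2_without <- 1 - sum(resid(fit0)^2) / sum(y^2)              # about zero

# Both hand-computed values should agree with what summary() reports
all.equal(r2_with,    summary(fit1)$r.squared)
all.equal(r2_without, summary(fit0)$r.squared)
```

Because the two denominators differ, the two reported values are not
directly comparable.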

Leeds, Mark (IED) wrote:
> Eik : Today, I've been reading Myers' text, "Classical and Modern Regression
> with Applications", to refresh my memory
> about regression because it's been a while since I looked at that material.
> The subtraction of the means from
> both sides of the equation causing the intercept to be zero now makes more
> sense because, in the simple regression
> case,
>
> b0 = y bar - b1 x bar and, by subtracting the means, y bar and x bar both
> become zero, so b0 = zero.
>
> If you have any other comments, they are much appreciated and always welcome,
> but I think between what you showed and the above,
> it's clearer now. I think I will go with centering both the left-hand and
> right-hand sides to force the zero intercepts, estimate
> each model with the intercept ( which will hopefully be estimated
> numerically as very close to zero ) and then compare
> the R-squareds of the two models. If you still see this as a problem, let me
> know because I am totally open to listening to other
> people's brains, especially good ones like yours.
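
The centering argument above (b0 = y bar - b1 x bar, so b0 becomes zero once
both means are zero) can be checked numerically. This is only a sketch on
simulated data; the variable names and the simulated model are assumptions:

```r
set.seed(123)
x <- rnorm(60, mean = 10, sd = 2)
y <- 3 + 1.5 * x + rnorm(60)

fit_raw <- lm(y ~ x)

# Center both sides by subtracting the respective means
xc <- x - mean(x)
yc <- y - mean(y)
fit_centered <- lm(yc ~ xc)

# b0 = ybar - b1 * xbar in the uncentered fit
b0_check <- mean(y) - unname(coef(fit_raw)[2]) * mean(x)
all.equal(unname(coef(fit_raw)[1]), b0_check)

# After centering, the intercept is numerically zero, while the slope
# and R-squared are unchanged
unname(coef(fit_centered)[1])
all.equal(unname(coef(fit_raw)[2]), unname(coef(fit_centered)[2]))
all.equal(summary(fit_raw)$r.squared, summary(fit_centered)$r.squared)
```

Since the centered fit keeps its intercept term, its R-squared is still the
one defined about the mean.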

Re: [R] FW: How to fit an linear model withou intercept

2007-08-28 Thread Eik Vettorazzi
Hi Mark,
I don't know whether you received a sufficient reply or not, so here are
my comments on your problem.
Suppressing the constant term in a regression model will probably lead to
a violation of the classical assumptions for this model.
From the OLS normal equations (in matrix notation)
 (1)  (X'X)b=X'y
and the definition of the OLS residuals
 (2)  e = y-Xb
you get - by substituting y from (2) into (1) -
   (X'X)b=(X'X)b+X'e
and hence
   X'e =0.
Without a constant term you cannot ensure that the OLS residuals
e=(y-Xb) have zero mean. This does hold when a constant term is included,
since in that case the first equation of X'e = 0 gives sum(e)=0.

For decomposing the TSS (y'y) into ESS (b'X'Xb) and RSS (e'e), which is
needed to compute R², you will need X'e=0, because then the
cross-product term b'X'e vanishes.
Correct me if I'm wrong.
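
A small numerical check of the argument above, as a sketch on simulated data
(all names and the simulated model are assumptions). It suggests a slight
refinement: the normal equations give X'e = 0 with or without a constant, so
the split y'y = b'X'Xb + e'e survives in both cases. What is lost without the
constant is sum(e) = 0, and with it the decomposition of the sum of squares
about the mean that the usual R² relies on:

```r
set.seed(42)
n <- 100
x <- rnorm(n, mean = 3)
y <- 5 + 2 * x + rnorm(n)

fit1 <- lm(y ~ x)        # with constant term
fit0 <- lm(y ~ x - 1)    # constant term suppressed

X1 <- model.matrix(fit1); e1 <- resid(fit1)
X0 <- model.matrix(fit0); e0 <- resid(fit0)

# X'e = 0 holds for either design matrix (up to rounding error)
xe1 <- max(abs(crossprod(X1, e1)))
xe0 <- max(abs(crossprod(X0, e0)))

# The residuals sum to zero only when X contains a column of ones
sum_e1 <- sum(e1)
sum_e0 <- sum(e0)

# Sum of squares about zero: y'y = b'X'Xb + e'e in both models
unc1 <- all.equal(sum(y^2), sum(fitted(fit1)^2) + sum(e1^2))
unc0 <- all.equal(sum(y^2), sum(fitted(fit0)^2) + sum(e0^2))

# Sum of squares about the mean: the split needs sum(e) = 0
tss_c <- sum((y - mean(y))^2)
cen1 <- isTRUE(all.equal(tss_c, sum((fitted(fit1) - mean(y))^2) + sum(e1^2)))
cen0 <- isTRUE(all.equal(tss_c, sum((fitted(fit0) - mean(y))^2) + sum(e0^2)))

c(xe1 = xe1, xe0 = xe0, sum_e1 = sum_e1, sum_e0 = sum_e0)
c(centered_with = cen1, centered_without = cen0)
```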

Leeds, Mark (IED) wrote:
> Park, Eik : Could you start from the bottom and read this when you have
> time. I really appreciate it.
>
> Basically, in a nutshell, my question is the "Hi John" part and I want
> to do my study correctly. Thanks a lot.
>
>
>
> -Original Message-
> From: Leeds, Mark (IED) 
> Sent: Thursday, August 23, 2007 1:05 PM
> To: 'John Sorkin'
> Cc: '[EMAIL PROTECTED]'
> Subject: RE: [R] How to fit an linear model withou intercept
>
>  Hi John : I'm from the R-list obviously and that was a nice example
> that I cut and pasted and learned from.  I'm sorry to bother you but I
> had a non-R question that I didn't want to pose to the R-list because I
> think it's been discussed a lot in the past but I never focused on the
> discussion.
>
> I need to do a study where I decide between two different univariate
> regressions models. The LHS is the same in both cases and it's not the
> goal of the study to build a prediction model but rather to see which
> RHS ( univariate ) explains the LHS better. 
> It's actually in a time series framework also but that's not relevant
> for my question. My question has 2 parts : 
>
> 1) I was leaning towards using the R-squared as the decision criterion (
> I will be regressing monthly and over a couple of years so I will have
> about 24 R-squareds. I have tons of data for one monthly regression so I
> don't have to just do one big regression over the whole time period )
> but I noticed in your previous example that the model with intercept (
> compared to the model forced to have zero intercept ) had a lower R^2
> and a lower standard error at the same time ! So this asymmetry
> leads me to think that maybe I should be using standard error rather
> than R-squared as my criterion ?
>
> 2) This is possibly related to 1 : Isn't there a problem with using the
> R-squared for anything when you force no intercept ?
> I think I remember seeing discussions about this on the list. That's why
> I was thinking of including the intercept
> ( the intercept in my problem really has no meaning but I wanted to retain
> the validity of the R-squared ). But, now that I see your email, maybe I
> should still be including an intercept and using standard error as the
> criterion.
> Or maybe when you include an intercept ( in both cases ) you don't get
> this asymmetry between R-squared and standard error.
> I was surprised to see the asymmetry but maybe it happens because one
> is comparing a model with intercept to a model without intercept, and no
> intercept probably renders the R-squared criterion meaningless in the
> latter.
>
> Thanks for any insight you can provide. I can also center and go without
> intercept because it sounded like you DEFINITELY preferred that method
> over just not including an intercept at all.  I was thinking of sending
> this question to the R-list but I didn't want to get hammered because I
> know that this is not a new discussion. Thanks so much.
>
>
>   
> Mark
>
> P.S : How the heck did you get an MD and a Ph.D ? Unbelievable. Did you
> do them at the same time ? 
>
>
>
>
> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of John Sorkin
> Sent: Thursday, August 23, 2007 9:29 AM
> To: David Barron; Michal Kneifl; r-help
> Subject: Re: [R] How to fit an linear model withou intercept
>
> Michael,
> Assuming you want a model with an intercept of zero, I think we need to
> ask you why you want an intercept of zero. When a "normal" regression
> indicates a non-zero intercept, forcing the regression line to have a
> zero intercept changes the meaning of the regression coefficients. If
> for some reason you want to have a zero intercept, but do not want to
> change the meaning of the regression coefficients, i.e. you still want
> to minimize the sum of the squared deviations from the BLUE (Best
> Linear Unbiased Estimator) of the regression, you can center your
> dependent and independent variables and re-run the regression. Centering
> means subtracting the mean of each variable from the variable before
> performing the regression. When you do this, the int

Re: [R] Help with vector gymnastics

2007-08-23 Thread Eik Vettorazzi
try

5*which(tf)[cumsum(tf)]
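
To see why this works, it helps to look at the intermediate steps on the
example from the question. This is just the one-liner above split into named
pieces (the names pos and grp are made up for illustration):

```r
tf <- c(TRUE, FALSE, FALSE, FALSE, TRUE, TRUE, FALSE)

pos <- which(tf)     # positions of the TRUE entries
grp <- cumsum(tf)    # for each i, how many TRUEs have been seen so far

# pos[grp] is the position of the most recent TRUE at or before each i
answer <- 5 * pos[grp]
answer
```

This relies on the stated initial condition tf[1] being TRUE. If the vector
starts with FALSE, cumsum(tf) begins with zeros, and indexing by zero drops
those elements, so the result would be shorter than tf.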

Gladwin, Philip wrote:
> Hello,
>
> What is the best way of solving this problem?
>
> answer <- ifelse(tf == TRUE, i * 5, previous answer)
> where as an initial condition 
> tf[1] <- TRUE
>
>
> For example if,
> tf <- c(T,F,F,F,T,T,F)
> over i = 1 to 7
> then the output of the function will be
> answer = 5 5 5 5 25 30 30 
>
> Thank you.
>
> Phil,
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>   


-- 
Eik Vettorazzi
Institut für Medizinische Biometrie und Epidemiologie
Universitätsklinikum Hamburg-Eppendorf

Martinistr. 52
22046 Hamburg

T ++49/40/42803-8243
F ++49/40/42803-7790



Re: [R] symbolic matrix elements...

2006-09-18 Thread Eik Vettorazzi
test=matrix(c( expression(x^3-5*x+4), expression(log(x^2-4*x))))
works.

By the way, you got NA instead of a derivative because D expects an
expression and you gave it a list:
> class(test[1])
[1] "list"
To see the error that actually relates to the misuse of the tilde operator,
you have to use the "correct" extractor "[[":
f<-test[[1]]
D(f,"x")
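
As a minimal sketch of the working pattern (the names d1, d2, d3 are made up,
and a plain expression vector is used here instead of a matrix to keep the
example small): "[[" pulls out the underlying call, which D can
differentiate. If the elements are stored as one-sided formulas after all,
the right-hand side can be taken from the formula with [[2]]:

```r
# Elements stored as expressions, as suggested above
test <- c(expression(x^3 - 5*x + 4), expression(log(x^2 - 4*x)))

d1 <- D(test[[1]], "x")   # "[[" extracts the call itself
d2 <- D(test[[2]], "x")

# With a one-sided formula, the right-hand side is the second component
f  <- ~ x^3 - 5*x + 4
d3 <- D(f[[2]], "x")

# The derivatives are unevaluated calls; evaluate them at a chosen x
eval(d1, list(x = 5))
eval(d2, list(x = 5))
eval(d3, list(x = 5))
```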


On Mon, 18 Sep 2006 18:30:57 +0200, Evan Cooch
<[EMAIL PROTECTED]> wrote:

> Normally, I do symbolics in Maple, or Mathematica, but I'm trying to
> write a simple script for students to handle some *very* simple
> calculations (for other purposes) with matrix or vector elements, where
> the elements are coded symbolically. What I've tried with *partial"
> success is use of the tilde (~) operator. So, for example, consider a
> simple vector:
>
> test=matrix(c(~ x^3-5*x+4, ~log(x^2-4*x)))
>
> Now, when I look at test, I see
>
>  > test
>  [,1]
> [1,] Expression
> [2,] Expression
>
> Fine. When I try to extract one of the vector elements, I see (for  
> example)
>
>  > test[1]
> [[1]]
> ~x^3 - 5 * x + 4
>
>
> Fine - but now I'm trying to figure out how to use the extracted matrix
> element for anything else. For example, using D for simple symbolic
> derivatives
>
> f <- test[1];
> D(f,"x")
>
> should *in theory* work, but I get the following:
>
>  > D(f,"x");
> [1] NA
>
> But, if I try
>
> f <- expression(x^3-5*x+4);
> D(f,"x");
>
> works fine.
>
> So, even though it looks as if each element of test is coded as an
> expression, it seems as though it is somehow a different type of
> expression than if I code it explicitly as an expression. I'm *guessing*
> it has to do with the tilde operator not assigning the formula to
> anything, but I'm not sure.
>
> Suggestions? Pointers to the obvious?
>
> Thanks!
>



-- 


Universität Hamburg
Institut für Statistik und Ökonometrie
Dipl.-Wi.-Math. Eik Vettorazzi
Von-Melle-Park 5
20146 Hamburg

Tel.: +49 40-42838-3540



Re: [R] negatively skewed data; reflecting

2006-08-23 Thread Eik Vettorazzi
A simple reflection (about the y-axis) of x is -x, but you have to ensure
that all values are strictly positive if you want to use the log
transformation. So you should reflect about a constant z greater than
max(x); this is done by z-x.
Alternatively, why not simply shift your data by a suitable constant, or
use a Box-Cox transformation instead?
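
A minimal sketch of the reflect-then-log idea on made-up data (the simulated
x stands in for a negatively skewed variable such as Lsoc, and the choice
z = max(x) + 1 is an assumption; any constant above max(x) keeps the
reflected values positive):

```r
set.seed(1)
x <- 10 - rexp(200)       # hypothetical negatively skewed data

z <- max(x) + 1           # reflection constant, chosen to exceed max(x)
x_reflected <- z - x      # reflected data: positive, long tail to the right
x_log <- log(x_reflected) # log is now defined for every observation

# Compare the shapes side by side
op <- par(mfrow = c(1, 3))
hist(x, main = "original")
hist(x_reflected, main = "reflected")
hist(x_log, main = "reflected + log")
par(op)
```

Keep in mind that the reflection reverses the ordering, so large values of
the transformed variable correspond to small values of the original one.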

On Wed, 23 Aug 2006 14:08:08 +0200, <[EMAIL PROTECTED]> wrote:

> Hi,
>
> This problem may be very easy, but I can't think of how to do it.  I  
> have constructed histograms of various variables in my dataset.  Some of  
> them are negatively skewed, and hence need data transformations  
> applied.  I know that you first need to reflect the negatively skewed  
> data and then apply another transformation such as log, square root etc  
> to bring it towards normality. How is it that I reflect data in R?  I'm  
> sorry if this seems a very simple task, I think it involves going back  
> to Maths GCSE and relearning reflection, rotation, translation etc!  I  
> have searched the internet, but cannot come up with anything useful on  
> how to reflect data.
>
>> hist(Lsoc)  #how do I reflect Lsoc in R?
>
> I am grateful for any help regarding this matter, it is just a very  
> small part of my analysis and doesn't seem worth agonising hours over.   
> I will probably kick myself when someone tells me the answer!
>
> Thank you very much,
>
> Zoe



Re: [R] Vectorizing a "for" loop

2006-08-03 Thread Eik Vettorazzi
res<-outer(rows,columns,FUN=function(x,y) abs(x-y))

will help you.
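
As a quick check on the vectors from the question, the outer() call can be
compared with the double loop (res_loop and res_outer are names made up for
this sketch):

```r
rows    <- c(1, 2, 3, 4, 5)
columns <- c(10, 11, 12, 13, 14)

# Loop version from the question
res_loop <- matrix(nrow = length(rows), ncol = length(columns))
for (i in seq_along(rows)) {
  for (j in seq_along(columns)) {
    res_loop[i, j] <- abs(rows[i] - columns[j])
  }
}

# Vectorized version: FUN is called once on all (row, column) pairs
res_outer <- outer(rows, columns, FUN = function(x, y) abs(x - y))

all.equal(res_loop, res_outer)
```

Note that outer() passes whole vectors to FUN, so FUN has to be vectorized,
which abs(x - y) is.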

On Thu, 03 Aug 2006 16:10:46 +0200, Daniel Gerlanc
<[EMAIL PROTECTED]> wrote:

> Hello all,
>
> Consider the following problem:
>
> There are two vectors:
>
> rows <- c(1, 2, 3, 4, 5)
> columns <- c(10, 11, 12, 13, 14)
>
> I want to create a matrix with dimensions length(rows) x length(columns):
>
> res <- matrix(nrow = length(rows), ncol = length(columns))
>
> If "i" and "j" are the row and column indexes respectively, the values
> of the cells are abs(rows[i] - columns[j]).  The resultant matrix
> follows:
>
>      [,1] [,2] [,3] [,4] [,5]
> [1,]    9   10   11   12   13
> [2,]    8    9   10   11   12
> [3,]    7    8    9   10   11
> [4,]    6    7    8    9   10
> [5,]    5    6    7    8    9
>
> This matrix may be generated by using a simple "for" loop:
>
> for(i in 1:length(rows)){
>   for(j in 1:length(columns)){
>     res[i,j] <- abs(rows[i] - columns[j])
>   }
> }
>
> Is there a quicker, vector-based approach for doing this or a function
> included in the recommended packages that does this?
>
> Thanks!
>
> -- Dan Gerlanc
> Williams College



-- 


Universität Hamburg
Institut für Statistik und Ökonometrie
Dipl.-Wi.-Math. Eik Vettorazzi
Von-Melle-Park 5
20146 Hamburg

Tel.: +49 40-42838-3540



[R] Plotting league tables/ caterpillar plots

2006-07-24 Thread Eik Vettorazzi
Dear list,
I was wondering whether there is a function to plot league tables, sometimes
also known as "caterpillar plots".
A league table is conceptually very similar to a box plot. One difference
is that the interquartile ranges are not shown. If there is no such
function, a first attempt at a "home-made" plot would be to tell boxplot
not to plot boxes (sounds silly, doesn't it?). I've tried the option
"boxwex=0", but the result is unsatisfactory.

An example of a league table can be found in Marshall, Spiegelhalter
[1998], Reliability of league tables of in vitro fertilisation clinics,
BMJ 1998;316:1701-1705; you may find it at http://bmj.bmjjournals.com

Thanks in advance

Eik Vettorazzi
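
For reference, a plot of the kind described above can be sketched with base
graphics alone, without going through boxplot. This is only a rough
illustration on made-up data: the 20 units, their estimates and the 95%
intervals are all assumptions, and it is not the method used in the cited
paper.

```r
set.seed(7)
# Hypothetical data: one estimate and standard error per unit
est <- rnorm(20, mean = 15, sd = 3)
se  <- runif(20, min = 0.5, max = 2)
lower <- est - 1.96 * se
upper <- est + 1.96 * se

ord  <- order(est)        # rank the units by their estimate
rank <- seq_along(ord)

# Points for the estimates, vertical segments for the intervals
plot(rank, est[ord], ylim = range(lower, upper), pch = 19,
     xlab = "Rank", ylab = "Estimate with 95% interval")
segments(rank, lower[ord], rank, upper[ord])
```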
