On 2013-04-04 02:11, Cecilia Carmo wrote:
Thank you all. I'm very happy with this solution. Just two questions:
I used mutate() from the plyr package and it gave me an error message. Is it a new 
function, so that my package may be too old?
Is there any extractor for the R-squared?

Thanks again,

Cecília Carmo

According to the plyr NEWS file, mutate was introduced in
Version 1.3 (2010-12-28). I would hope that your version is
newer than that. You should tell us what the error message is.
Anyway, you can always use R's within() function instead;
or use transform() as Jean suggested.
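
A minimal sketch of both points, assuming a small made-up data frame (toy) that stands in for the merged data, with the same column names used in this thread. It first prints the installed plyr version, then does the two-step calculation with base R's within(). Note that a single transform() call cannot do both steps, because its expressions are evaluated against the original columns only, so residual would not see Yhat.

```r
# Print the installed plyr version, if any; mutate() needs 1.3 or later
# according to the NEWS entry cited above.
if (requireNamespace("plyr", quietly = TRUE)) {
  print(packageVersion("plyr"))
}

# Hypothetical stand-in for the merged data (values are made up).
set.seed(1)
toy <- data.frame(X = rnorm(5), Z = rnorm(5), Y = rnorm(5),
                  b0 = 0.5, b1 = 1, b2 = -2)

# within() evaluates the block in order, so 'residual' can use the
# 'Yhat' created on the previous line.
toy2 <- within(toy, {
  Yhat <- b0 + b1 * X + b2 * Z
  residual <- Y - Yhat
})
head(toy2)
```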

Peter Ehlers


________________________________________
From: Peter Ehlers [ehl...@ucalgary.ca]
Sent: Wednesday, 3 April 2013 19:01
To: Adams, Jean
Cc: Cecilia Carmo; r-help@r-project.org
Subject: Re: [R] linear model coefficients by year and industry, fitted values, 
residuals, panel data

A few minor improvements to Jean's post suggested inline below.

On 2013-04-03 05:41, Adams, Jean wrote:
Cecilia,

Thanks for providing a reproducible example.  Excellent.

You could use the ddply() function in the plyr package to fit the model for
each industry and year, keep the coefficients, and then estimate the fitted
and residual values.

Jean

library(plyr)
coef <- ddply(final3, .(industry, year),
              function(dat) lm(Y ~ X + Z, data=dat)$coef)
names(coef) <- c("industry", "year", "b0", "b1", "b2")
final4 <- merge(final3, coef)
newdata1 <- transform(final4, Yhat = b0 + b1*X + b2*Z)
newdata2 <- transform(newdata1, residual = Y-Yhat)
plot(as.factor(newdata2$firm), newdata2$residual)

Suggestion 1:
Use the extractor function coef() and also avoid using the name
of an R function as a variable name:

   Coef <- ddply(...., function(dat) coef(lm(....)))
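
As a rough sketch of what the call might look like with the arguments from Jean's code filled back in: the data frame demo and the helper fit_one() are made up here for illustration and are not part of the thread. Returning summary(fit)$r.squared alongside the coefficients is one way to also keep the R-squared of each fit.

```r
library(plyr)

# Hypothetical stand-in for final3 (made-up values, same column names).
set.seed(42)
demo <- expand.grid(firm = 1:6, year = 2000:2001, industry = c(20, 30))
demo$X <- rnorm(nrow(demo))
demo$Z <- rnorm(nrow(demo))
demo$Y <- 1 + 2 * demo$X - demo$Z + rnorm(nrow(demo))

# Fit the model on one industry/year subset and return the
# coefficients together with the R-squared as a named vector.
fit_one <- function(dat) {
  fit <- lm(Y ~ X + Z, data = dat)
  c(coef(fit), r.squared = summary(fit)$r.squared)
}

# One row per industry/year combination.
Coef <- ddply(demo, .(industry, year), fit_one)
names(Coef) <- c("industry", "year", "b0", "b1", "b2", "r.squared")
Coef
```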

Suggestion 2:
Use plyr's mutate() to do both transforms at once:

   newdata <- mutate(final4,
                     Yhat = b0 + b1*X + b2*Z,
                     residual = Y-Yhat)

[Or you could use within(), but I now find mutate handier, mainly
because it doesn't 'reverse' the order of the new variables.]

Suggestion 3:
Use the 'data=' argument in the plot:

   boxplot(residual ~ firm, data = newdata)
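
An alternative that is not proposed in this thread, shown only as a sketch: if the coefficients themselves are not needed as columns, the fitted values and residuals can be taken straight from each fit with fitted() and resid(), which skips the merge step. The data frame demo and the helper add_fit() are made up for illustration.

```r
library(plyr)

# Hypothetical stand-in for final3 (made-up values, same column names).
set.seed(7)
demo <- expand.grid(firm = 1:6, year = 2000:2001, industry = c(20, 30))
demo$X <- rnorm(nrow(demo))
demo$Z <- rnorm(nrow(demo))
demo$Y <- 1 + 2 * demo$X - demo$Z + rnorm(nrow(demo))

# Fit the model on one industry/year subset and return that subset
# with the fitted values and residuals attached.
add_fit <- function(dat) {
  fit <- lm(Y ~ X + Z, data = dat)
  data.frame(dat, Yhat = fitted(fit), residual = resid(fit))
}

newdata <- ddply(demo, .(industry, year), add_fit)

# Residuals by firm, using the 'data=' argument as in Suggestion 3.
boxplot(residual ~ firm, data = newdata)
```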

Peter Ehlers


On Wed, Apr 3, 2013 at 3:38 AM, Cecilia Carmo <cecilia.ca...@ua.pt> wrote:

Hi R-helpers,



My real data is a panel (unbalanced and with gaps in years) of thousands
of firms, by year and industry, with financial information (variables
X, Y, Z, for example). The number of firms by year and industry is not
always equal, and the number of years by industry is not always equal.



#reproducible example
firm1<-sort(rep(1:10,5),decreasing=FALSE)
year1<-rep(2000:2004,10)
industry1<-rep(20,50)
X<-rnorm(50)
Y<-rnorm(50)
Z<-rnorm(50)
data1<-data.frame(firm1,year1,industry1,X,Y,Z)
data1
colnames(data1)<-c("firm","year","industry","X","Y","Z")



firm2<-sort(rep(11:15,3),decreasing=FALSE)
year2<-rep(2001:2003,5)
industry2<-rep(30,15)
X<-rnorm(15)
Y<-rnorm(15)
Z<-rnorm(15)
data2<-data.frame(firm2,year2,industry2,X,Y,Z)
data2
colnames(data2)<-c("firm","year","industry","X","Y","Z")

firm3<-sort(rep(16:20,4),decreasing=FALSE)
year3<-rep(2001:2004,5)
industry3<-rep(40,20)
X<-rnorm(20)
Y<-rnorm(20)
Z<-rnorm(20)
data3<-data.frame(firm3,year3,industry3,X,Y,Z)
data3
colnames(data3)<-c("firm","year","industry","X","Y","Z")



final1<-rbind(data1,data2)
final2<-rbind(final1,data3)
final2
final3<-final2[order(final2$industry,final2$year),]
final3



I need to estimate a linear model Y = b0 + b1X + b2Z by industry and year,
to obtain the estimates of b0, b1 and b2 by industry and year (for example
I need to have the b0 for industry 20 and year 2000, for industry 20 and
year 2001...). Then I need to calculate the fitted values and the residuals
by firm so I need to keep b0, b1 and b2 in a way that I could do something
like
newdata1<-transform(final3,Y'=b0+b1.X+b2.Z)
newdata2<-transform(newdata1,residual=Y-Y')
or another way to keep Y' and the residuals in a dataframe with the
columns firm and year.



Until now I have been doing this in a very hard way, and because I need to do
it several times, I need your help to find an easier way.



Thank you,



Cecília Carmo

Universidade de Aveiro

Portugal

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
