On 2013-04-04 02:11, Cecilia Carmo wrote:
Thank you all. I'm very happy with this solution. Just two questions:

I used mutate() from the plyr package and it gave me an error message. Is it a new function, and might my package be too old?

Is there any extractor for the R-squared?

Thanks again,
Cecília Carmo
According to the plyr NEWS file, mutate was introduced in Version 1.3 (2010-12-28). I would hope that your version is newer than that. You should tell us what the error message is.

Anyway, you can always use R's within() function instead; or use transform() as Jean suggested.

Peter Ehlers
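The R-squared question above is not answered in the thread. One possible approach, offered only as a sketch: base R's summary() of an lm fit stores the value in its r.squared component. The same example shows the within() alternative Peter mentions. The toy data frame dat and the names fit, r2, adj_r2, b and out are made up for illustration and do not come from the original posts.

```r
# Toy data: column names mirror the thread, the values are invented.
set.seed(1)
dat <- data.frame(X = rnorm(10), Z = rnorm(10))
dat$Y <- 1 + 2 * dat$X - dat$Z + rnorm(10)

fit <- lm(Y ~ X + Z, data = dat)

# R-squared and adjusted R-squared are components of the summary object.
r2     <- summary(fit)$r.squared
adj_r2 <- summary(fit)$adj.r.squared

# within() evaluates the block in order, so residual can use Yhat
# in the same call (transform() cannot do this in one step).
b <- unname(coef(fit))
out <- within(dat, {
  Yhat     <- b[1] + b[2] * X + b[3] * Z
  residual <- Y - Yhat
})
```

Note that within() appends the new columns in reverse order of assignment (residual before Yhat), which is the 'reversal' Peter refers to further down the thread.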
________________________________________
From: Peter Ehlers [ehl...@ucalgary.ca]
Sent: Wednesday, 3 April 2013 19:01
To: Adams, Jean
Cc: Cecilia Carmo; r-help@r-project.org
Subject: Re: [R] linear model coefficients by year and industry, fitted values, residuals, panel data

A few minor improvements to Jean's post suggested inline below.

On 2013-04-03 05:41, Adams, Jean wrote:
Cecilia,

Thanks for providing a reproducible example. Excellent.

You could use the ddply() function in the plyr package to fit the model for each industry and year, keep the coefficients, and then estimate the fitted and residual values.

Jean

library(plyr)
coef <- ddply(final3, .(industry, year), function(dat) lm(Y ~ X + Z, data=dat)$coef)
names(coef) <- c("industry", "year", "b0", "b1", "b2")
final4 <- merge(final3, coef)
newdata1 <- transform(final4, Yhat = b0 + b1*X + b2*Z)
newdata2 <- transform(newdata1, residual = Y-Yhat)
plot(as.factor(newdata2$firm), newdata2$residual)

Suggestion 1: Use the extractor function coef() and also avoid using the name of an R function as a variable name:

  Coef <- ddply(...., function(dat) coef(lm(....)))

Suggestion 2: Use plyr's mutate() to do both transforms at once:

  newdata <- mutate(final4, Yhat = b0 + b1*X + b2*Z, residual = Y-Yhat)

[Or you could use within(), but I now find mutate handier, mainly because it doesn't 'reverse' the order of the new variables.]

Suggestion 3: Use the 'data=' argument in the plot:

  boxplot(residual ~ firm, data = newdata)

Peter Ehlers

On Wed, Apr 3, 2013 at 3:38 AM, Cecilia Carmo <cecilia.ca...@ua.pt> wrote:
Hi R-helpers,

My real data is a panel (unbalanced and with gaps in years) of thousands of firms, by year and industry, with financial information (variables X, Y, Z, for example). The number of firms by year and industry is not always equal, and the number of years by industry is not always equal.
# reproducible example
# industry 20: firms 1-10, years 2000-2004
firm1 <- sort(rep(1:10, 5), decreasing = FALSE)
year1 <- rep(2000:2004, 10)
industry1 <- rep(20, 50)
X <- rnorm(50)
Y <- rnorm(50)
Z <- rnorm(50)
data1 <- data.frame(firm1, year1, industry1, X, Y, Z)
data1
colnames(data1) <- c("firm", "year", "industry", "X", "Y", "Z")

# industry 30: firms 11-15, years 2001-2003
firm2 <- sort(rep(11:15, 3), decreasing = FALSE)
year2 <- rep(2001:2003, 5)
industry2 <- rep(30, 15)
X <- rnorm(15)
Y <- rnorm(15)
Z <- rnorm(15)
data2 <- data.frame(firm2, year2, industry2, X, Y, Z)
data2
colnames(data2) <- c("firm", "year", "industry", "X", "Y", "Z")

# industry 40: firms 16-20, years 2001-2004
firm3 <- sort(rep(16:20, 4), decreasing = FALSE)
year3 <- rep(2001:2004, 5)
industry3 <- rep(40, 20)
X <- rnorm(20)
Y <- rnorm(20)
Z <- rnorm(20)
data3 <- data.frame(firm3, year3, industry3, X, Y, Z)
data3
colnames(data3) <- c("firm", "year", "industry", "X", "Y", "Z")

# stack the three industries and sort by industry and year
final1 <- rbind(data1, data2)
final2 <- rbind(final1, data3)
final2
final3 <- final2[order(final2$industry, final2$year), ]
final3

I need to estimate a linear model Y = b0 + b1X + b2Z by industry and year, to obtain the estimates of b0, b1 and b2 by industry and year (for example, I need to have the b0 for industry 20 and year 2000, for industry 20 and year 2001...). Then I need to calculate the fitted values and the residuals by firm, so I need to keep b0, b1 and b2 in a way that lets me do something like

newdata1<-transform(final3,Y'=b0+b1.X+b2.Z)
newdata2<-transform(newdata1,residual=Y-Y')

or another way to keep Y' and the residuals in a data frame with the columns firm and year.

Until now I have been doing this in a very hard way, and because I need to do it several times, I need your help to find an easier way.

Thank you,

Cecília Carmo
Universidade de Aveiro
Portugal
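For readers who cannot get mutate() to work, the same per-industry, per-year workflow can be sketched in base R alone. This is an illustration under stated assumptions, not the method posted in the thread: make_panel() is a hypothetical helper that builds a compact stand-in for final3 with the same columns and the same unbalanced layout, and the r2 column is an addition that carries the R-squared of each fit.

```r
set.seed(42)

# Hypothetical helper: one industry block with random X, Y, Z.
make_panel <- function(firms, years, industry) {
  d <- expand.grid(year = years, firm = firms)
  data.frame(firm = d$firm, year = d$year, industry = industry,
             X = rnorm(nrow(d)), Y = rnorm(nrow(d)), Z = rnorm(nrow(d)))
}
final3 <- rbind(make_panel(1:10, 2000:2004, 20),
                make_panel(11:15, 2001:2003, 30),
                make_panel(16:20, 2001:2004, 40))

# One data frame per industry-year cell; drop = TRUE skips empty cells.
cells <- split(final3, list(final3$industry, final3$year), drop = TRUE)

# Fit the model in each cell and keep coefficients plus R-squared.
Coef <- do.call(rbind, lapply(cells, function(dat) {
  fit <- lm(Y ~ X + Z, data = dat)
  b <- unname(coef(fit))
  data.frame(industry = dat$industry[1], year = dat$year[1],
             b0 = b[1], b1 = b[2], b2 = b[3],
             r2 = summary(fit)$r.squared)
}))

# merge() joins on the shared columns industry and year.
final4 <- merge(final3, Coef)

# Fitted values and residuals per firm, computed in one within() call.
newdata <- within(final4, {
  Yhat     <- b0 + b1 * X + b2 * Z
  residual <- Y - Yhat
})
```

A call such as boxplot(residual ~ firm, data = newdata) then gives the per-firm residual plot from Suggestion 3.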