[R] Robust vce for heckman estimators
When using the function heckit() from the package "sampleSelection", is there any way to run t-tests on the coefficients using a robust covariance matrix estimator? By "robust" I mean something like this: if I had an "lm" object called "reg", I would use coeftest(reg, vcov = vcovHC(reg)). I'm asking because in Stata we can use heckman with the vce option "robust", and we can do the same for cluster. More generally, is there any way to use another covariance matrix for t-tests (e.g. linear hypotheses) on heckit (selection) models?

Thanks,
Mateus Rabello

[[alternative HTML version deleted]]
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
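To make the "more general" part of the question concrete, here is a minimal base-R sketch of Wald-type tests built from any coefficient vector and any conformable covariance matrix. The helper name wald_table is hypothetical, and the demonstration uses an "lm" fit on the built-in cars data with a hand-computed HC0 sandwich matrix. It does not show how to obtain a valid robust matrix for a heckit fit; it only assumes that, once such a matrix is available, it could be passed together with coef() of the fitted model. The p-values use the normal reference distribution, which is also an assumption.

```r
# Hypothetical helper: Wald-type z-tests from a coefficient vector `beta`
# and a covariance matrix `V` whose rows/columns are in the same order.
wald_table <- function(beta, V) {
  se <- sqrt(diag(V))          # standard errors from the diagonal of V
  z  <- beta / se              # test statistic for H0: coefficient = 0
  p  <- 2 * pnorm(-abs(z))     # two-sided p-value, normal reference
  cbind(Estimate = beta, `Std. Error` = se, `z value` = z, `Pr(>|z|)` = p)
}

# Illustration on a plain linear model (not a selection model).
reg <- lm(dist ~ speed, data = cars)

# HC0 heteroskedasticity-robust covariance computed by hand:
# (X'X)^{-1} X' diag(e^2) X (X'X)^{-1}
X     <- model.matrix(reg)
e     <- residuals(reg)
bread <- solve(crossprod(X))
meat  <- crossprod(X * e)      # each row of X scaled by its residual
V_hc0 <- bread %*% meat %*% bread

tab <- wald_table(coef(reg), V_hc0)
print(tab)
```

Passing vcov(reg) instead of V_hc0 reproduces the usual (non-robust) standard errors, which is a quick way to check that the coefficient vector and the matrix line up.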
[R] Create factor variable by groups
Hi,

suppose that I have the following data.frame:

cnae4     cnpj 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004    Y
24996 10020470    1    1    2   12   16   21   17   51   43   19  183
24996 10020470   69   91   79   92   91   77   90   96   98  108  891
36145 10020470    0    0    0    0    2   83  112   97   91  144  529
    4     1002    5   20   60    0    0    0    0    5   20 1000 1110

I would like to create a new variable X that indicates which row, within each value of cnpj, has the highest value of Y. For instance, within cnpj = 10020470, the second row has the largest Y (891). The case cnpj = 1002 is trivial (1110). My new data.frame would then become:

cnae4     cnpj 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004    Y     X
24996 10020470    1    1    2   12   16   21   17   51   43   19  183 FALSE
24996 10020470   69   91   79   92   91   77   90   96   98  108  891  TRUE
36145 10020470    0    0    0    0    2   83  112   97   91  144  529 FALSE
    4     1002    5   20   60    0    0    0    0    5   20 1000 1110  TRUE

Notice that for every value of cnpj, only one row will have X = TRUE.

Then I would like to create a variable Z that is the sum of Y, also by cnpj. Thus, if cnpj = 10020470, Z = 183 + 891 + 529 = 1603, and for cnpj = 1002, Z = 1110. These sums can easily be computed with tapply or aggregate, but those would collapse the rows with equal cnpj and I don't want that. I would like to end up with a data.frame like the following:

cnae4     cnpj 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004    Y     X    Z
24996 10020470    1    1    2   12   16   21   17   51   43   19  183 FALSE 1603
24996 10020470   69   91   79   92   91   77   90   96   98  108  891  TRUE 1603
36145 10020470    0    0    0    0    2   83  112   97   91  144  529 FALSE 1603
    4     1002    5   20   60    0    0    0    0    5   20 1000 1110  TRUE 1110

In the end I will eliminate all rows with X = FALSE.

Thank you and sorry for the long question.
Mateus Rabello
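One possible way to get both columns without collapsing rows is base R's ave(), which applies a function within groups and returns a vector of the original length. The sketch below is only an illustration: the data frame name df is an assumption, and the yearly columns are left out because only cnpj and Y are needed. Using which.max() flags the first maximum in each group, so exactly one row per cnpj gets X = TRUE even when there are ties.

```r
# Reduced version of the example data (yearly columns omitted for brevity).
df <- data.frame(
  cnae4 = c(24996, 24996, 36145, 4),
  cnpj  = c(10020470, 10020470, 10020470, 1002),
  Y     = c(183, 891, 529, 1110)
)

# Group sum of Y, repeated on every row of the group.
df$Z <- ave(df$Y, df$cnpj, FUN = sum)

# Flag the first row holding the group maximum of Y.
# ave() returns 0/1 here, so compare with 1 to get a logical column.
df$X <- ave(df$Y, df$cnpj,
            FUN = function(v) seq_along(v) == which.max(v)) == 1

# Keep only the flagged rows.
result <- df[df$X, ]
print(result)
```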
[R] Conditional Correlation
Hi,

How can I accomplish this in R? Example: I have the following data.frame:

data <- data.frame(x=c(1,2,3,4,5,6,5,3,7,1,0,4,8),y=c(1,2,1,2,2,2,1,1,1,2,2,2,2),z=c(5,8,4,3,4,1,6,3,3,6,3,5,7))

Supposing that data$y is a factor, I would like to find the Spearman correlation between data$x and data$z within each level of data$y. To be more specific, I want two correlations: one between x and z where y==1, and the same correlation between x and z where y==2. Something like:

cor(data$x[data$y==1],data$z[data$y==1],method="spearman")

and

cor(data$x[data$y==2],data$z[data$y==2],method="spearman")

but without having to write out every value of data$y and call cor more than once. I hope I made myself clear.

Thanks
Mateus
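A minimal sketch of one way to do this in base R: split the data.frame by y and apply cor() to each piece. The result name by_group is an assumption; the data are the ones from the post.

```r
data <- data.frame(
  x = c(1, 2, 3, 4, 5, 6, 5, 3, 7, 1, 0, 4, 8),
  y = c(1, 2, 1, 2, 2, 2, 1, 1, 1, 2, 2, 2, 2),
  z = c(5, 8, 4, 3, 4, 1, 6, 3, 3, 6, 3, 5, 7)
)

# One Spearman correlation per level of y; the result is a named vector
# whose names are the values of y.
by_group <- sapply(
  split(data, data$y),
  function(d) cor(d$x, d$z, method = "spearman")
)
print(by_group)
```

The same pattern works for any number of levels of y, since split() creates one piece per distinct value.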