Re: [R] number of matches when using Match()

2006-04-24 Thread Brian Quinif
> > Speaking of standard errors, when correcting for heteroscedasticity,
> > how many matches do you use (this is the Var.calc option)?  It seems to
> > me that it might make sense to use the same number of matches as
> > above, but that's just a guess...
>
> These are related but separate issues.  The number of matches is all
> about covariate balance (bias reduction).  And the Var.calc option is
> related to the heterogeneity of the causal effect.  It could be that
> the data is such that one needs to do 1-to-1 matching to get good
> covariate balance, but that the causal effect is homogeneous so
> Var.calc can be set to 0 etc.

Ok, but in my case, I think that the treatment effect *is*
heterogeneous: I even partition my sample based on a number of
characteristics and find very different effects for these subsamples.
Given that, it seems that I certainly should not use Var.calc=0.

My question is how do I go about deciding what I should set Var.calc
equal to?  Should it be 1, or perhaps the number of matches I use for
the treatment effect?

Regards,

Brian

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] number of matches when using Match()

2006-04-22 Thread Jasjeet Singh Sekhon

> How do you go about deciding how many matches you will use?  With my
> data, my standard errors generally get smaller if I use more
> matches.

Generally, select the maximum number of matches that still results in
good or acceptable balance (hence bounding bias due to the observed
confounders).  See the MatchBalance() function for some measures of
balance, and GenMatch() for automatically maximizing (observed)
covariate balance.
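
As a rough illustration of that advice, the sketch below matches with a few candidate values of M and records the worst standardized mean difference that MatchBalance() reports after matching. The simulated data, the 1:3 grid, and the use of the largest absolute `sdiff` as the summary are all assumptions made for the example, not a recommended balance criterion.

```r
# Sketch: compare post-matching covariate balance across candidate M values.
# Data, the grid 1:3, and the "max |sdiff|" summary are illustrative assumptions.
if (requireNamespace("Matching", quietly = TRUE)) {
  library(Matching)

  set.seed(30)
  dat <- data.frame(matrix(rnorm(1000 * 5), ncol = 5))  # columns X1..X5
  dat$Tr <- c(rep(1, 500), rep(0, 500))
  X <- as.matrix(dat[, paste0("X", 1:5)])

  balance_by_M <- sapply(1:3, function(M) {
    # no outcome is needed to find the matches themselves
    m <- Match(Tr = dat$Tr, X = X, M = M)
    mb <- MatchBalance(Tr ~ X1 + X2 + X3 + X4 + X5, data = dat,
                       match.out = m, ks = FALSE, nboots = 0,
                       print.level = 0)
    # largest absolute standardized difference across the covariates
    max(abs(sapply(mb$AfterMatching, function(b) b$sdiff)))
  })
  names(balance_by_M) <- paste0("M=", 1:3)
  print(balance_by_M)
}
```

One would then keep the largest M whose balance is still judged acceptable; what counts as acceptable is exactly the open question discussed next.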

How to measure good balance is an open research question.  I will note
that the degree of covariate balance that is usually thought to be
acceptable in the applied literature isn't enough to get reliable
estimates in practice.  We can evaluate this by comparing an
observational estimate (with matching adjustment) with a known
experimental benchmark.  See:

http://sekhon.berkeley.edu/papers/GenMatch.pdf

> Speaking of standard errors, when correcting for heteroscedasticity,
> how many matches do you use (this is the Var.calc option)?  It seems to
> me that it might make sense to use the same number of matches as
> above, but that's just a guess...

These are related but separate issues.  The number of matches is all
about covariate balance (bias reduction).  And the Var.calc option is
related to the heterogeneity of the causal effect.  It could be that
the data is such that one needs to do 1-to-1 matching to get good
covariate balance, but that the causal effect is homogeneous so
Var.calc can be set to 0 etc.
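
To make the separation concrete, here is a minimal sketch that holds M fixed and changes only Var.calc (the full name of the argument in Match()). The simulated data and the value 4 are arbitrary assumptions for illustration, not a suggested setting.

```r
# Sketch: M controls which units are matched; Var.calc only affects the
# variance estimate. Data and Var.calc = 4 are illustrative assumptions.
if (requireNamespace("Matching", quietly = TRUE)) {
  library(Matching)

  set.seed(30)
  X  <- matrix(rnorm(1000 * 5), ncol = 5)
  Tr <- c(rep(1, 500), rep(0, 500))
  Y  <- rnorm(1000)

  # same matching (M = 1), two different variance settings
  m_homo <- Match(Y = Y, Tr = Tr, X = X, M = 1, Var.calc = 0)
  m_het  <- Match(Y = Y, Tr = Tr, X = X, M = 1, Var.calc = 4)

  # the point estimate is unchanged; only the reported SE can differ
  se_compare <- c(Var.calc0 = m_homo$se, Var.calc4 = m_het$se)
  print(c(est0 = m_homo$est, est4 = m_het$est))
  print(se_compare)
}
```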

> One more question about Match()...
> I am calculating a number of SATT's that all have the same covariates
> (X's) and treatment variables (Tr's).  I would like to take advantage
> of the matching that I do the first time to then quickly calculate the
> SATT for various different Y's?  How can I do that?  It would save
> serious computational time.

The following code expands on your code and will estimate the mean
causal effect and the naive standard errors without a second call to
Match().  Doing this for the Abadie-Imbens SEs instead of the naive
SEs is left as an exercise (take the code from the Matching.R file of
the package).  In a future version of the package, I'll make a
separate function to make all of this transparent by using the
"predict()" setup.

###
library(Matching)

set.seed(30)
#make up some data
X <- matrix(rnorm(1000*5), ncol=5)
Tr <- c(rep(1,500),rep(0,500))
Y1 <- as.vector(rnorm(1000))
Y2 <- as.vector(rnorm(1000))

satt.Y1 <- Match(Y=Y1, X=X, Tr=Tr, M=1)
summary(satt.Y1, full=TRUE)

cat("** Estimate Y2 BY Calling Match() \n")
satt.Y2 <- Match(Y=Y2, X=X, Tr=Tr, M=1)
summary(satt.Y2, full=TRUE)

cat("** Estimate Without Calling Match() \n")
# reuse the matched pairs (and their weights) found for Y1
index.treated <- satt.Y1$index.treated
index.control <- satt.Y1$index.control
weights <- satt.Y1$weights
Y <- Y2

# weighted mean of the treated-minus-control differences
mest  <- sum((Y[index.treated]-Y[index.control])*weights)/sum(weights)
cat("estimate for Y2:", mest, "\n")

# naive SE: weighted squared deviations over the squared sum of weights
v1  <- Y[index.treated] - Y[index.control]
varest  <- sum( ((v1-mest)^2)*weights)/(sum(weights)*sum(weights))
se.naive  <- sqrt(varest)
cat("naive SE Y2:", se.naive, "\n")

###
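
If there are many outcomes, the same arithmetic can be wrapped in a small helper and applied to each one. The function name `satt_from_match` and the list of outcomes are hypothetical, introduced only for this sketch; the helper reads nothing from the Match() result except `index.treated`, `index.control`, and `weights`, and it returns the naive SE, not the Abadie-Imbens one.

```r
# Sketch: reuse one set of matches for several outcomes.
# `satt_from_match` is a hypothetical helper, not part of the Matching package.
satt_from_match <- function(match_obj, Y) {
  d <- Y[match_obj$index.treated] - Y[match_obj$index.control]
  w <- match_obj$weights
  est <- sum(d * w) / sum(w)                     # weighted mean difference
  se  <- sqrt(sum((d - est)^2 * w) / sum(w)^2)   # naive standard error
  c(est = est, se.naive = se)
}

if (requireNamespace("Matching", quietly = TRUE)) {
  set.seed(30)
  X  <- matrix(rnorm(1000 * 5), ncol = 5)
  Tr <- c(rep(1, 500), rep(0, 500))
  outcomes <- list(Y1 = rnorm(1000), Y2 = rnorm(1000))

  # match once, using any one of the outcomes
  m <- Matching::Match(Y = outcomes$Y1, Tr = Tr, X = X, M = 1)

  # one column per outcome, rows "est" and "se.naive"
  print(sapply(outcomes, satt_from_match, match_obj = m))
}
```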

Cheers,
JS.

===
Jasjeet S. Sekhon 
  
Associate Professor 
Survey Research Center  
UC Berkeley 

http://sekhon.berkeley.edu/
V: 510-642-9974  F: 617-507-5524
===






[R] number of matches when using Match()

2006-04-12 Thread Brian Quinif
To anyone who uses the Match() function in the Matching library...

How do you go about deciding how many matches you will use?  With my
data, my standard errors generally get smaller if I use more matches.

Speaking of standard errors, when correcting for heteroscedasticity,
how many matches do you use (this is the Var.calc option)?  It seems to
me that it might make sense to use the same number of matches as
above, but that's just a guess...

One more question about Match()...
I am calculating a number of SATT's that all have the same covariates
(X's) and treatment variables (Tr's).  I would like to take advantage
of the matching that I do the first time to then quickly calculate the
SATT for various different Y's?  How can I do that?  It would save
serious computational time.

In case I'm not explaining myself well, in the example below, I would
like to calculate satt.Y2 without having to perform the matching all
over again, since with more data, the process can be very slow.

#make up some data
X <- matrix(rnorm(1000*5), ncol=5)
Tr <- c(rep(1,500),rep(0,500))
Y1 <- as.vector(rnorm(1000))
Y2 <- as.vector(rnorm(1000))

satt.Y1 <- Match(Y=Y1, X=X, Tr=Tr, M=1)
satt.Y2 <- Match(Y=Y2, X=X, Tr=Tr, M=1)

Thanks,

BQ
