Causal forests are very nice work. However, they deal with causal inference rather than prediction, so I am not really sure how we could implement them within the scikit-learn API. Do you have a suggestion?
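To make the mismatch concrete: scikit-learn estimators expose `fit(X, y)` and `predict(X)`, whereas a treatment-effect estimator also needs the treatment assignment of each sample, and what it returns is an effect rather than a predicted outcome. Below is a minimal sketch of what such an interface might look like. The class `TwoModelEffectEstimator`, its `treatment` argument and its `predict_effect` method are hypothetical names invented for illustration. They are not part of scikit-learn, and the method is a simple two-model difference, not the causal forest from the paper.

```python
import numpy as np
from sklearn.base import BaseEstimator, clone
from sklearn.ensemble import RandomForestRegressor


class TwoModelEffectEstimator(BaseEstimator):
    """Hypothetical illustration only: not in scikit-learn, not a causal forest.

    Fits one outcome model on treated samples and one on control samples,
    then reports the difference of their predictions as the estimated effect.
    """

    def __init__(self, base_estimator=None):
        self.base_estimator = base_estimator

    def fit(self, X, y, treatment):
        # Note the extra `treatment` argument: it has no slot in fit(X, y).
        X = np.asarray(X)
        y = np.asarray(y)
        treated = np.asarray(treatment).astype(bool)
        base = self.base_estimator
        if base is None:
            base = RandomForestRegressor(n_estimators=100, random_state=0)
        self.model_treated_ = clone(base).fit(X[treated], y[treated])
        self.model_control_ = clone(base).fit(X[~treated], y[~treated])
        return self

    def predict_effect(self, X):
        # Returns an estimated treatment effect per sample, not an outcome.
        X = np.asarray(X)
        return self.model_treated_.predict(X) - self.model_control_.predict(X)


# Synthetic data, assumed for the example: constant true effect of 2.0.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
treatment = rng.integers(0, 2, size=500)
y = X[:, 0] + 2.0 * treatment + rng.normal(scale=0.1, size=500)

effects = TwoModelEffectEstimator().fit(X, y, treatment).predict_effect(X)
```

Two things stand out in this sketch: the extra `treatment` argument does not fit the usual `fit(X, y)` signature, and there is no observed per-sample effect to score `predict_effect` against, so the standard scoring tools do not apply directly.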

Cheers,

Gaël

On Fri, May 24, 2019 at 05:21:50PM -0400, Randy Ellis wrote:
> Would this be difficult for a moderate user to implement in sklearn by
> modifying the existing code base?
>
> Estimation and Inference of Heterogeneous Treatment Effects using Random
> Forests
>
> 342 citations in less than a year (Google Scholar):
> https://amstat.tandfonline.com/doi/full/10.1080/01621459.2017.1319839
>
> "In this article, we develop a nonparametric causal forest for estimating
> heterogeneous treatment effects that extends Breiman’s widely used random
> forest algorithm. In the potential outcomes framework with unconfoundedness,
> we show that causal forests are pointwise consistent for the true treatment
> effect and have an asymptotically Gaussian and centered sampling
> distribution. We also discuss a practical method for constructing asymptotic
> confidence intervals for the true treatment effect that are centered at the
> causal forest estimates. Our theoretical results rely on a generic Gaussian
> theory for a large family of random forest algorithms. To our knowledge, this
> is the first set of results that allows any type of random forest, including
> classification and regression forests, to be used for provably valid
> statistical inference. In experiments, we find causal forests to be
> substantially more powerful than classical methods based on nearest-neighbor
> matching, especially in the presence of irrelevant covariates."

--
Gael Varoquaux
Senior Researcher, INRIA
http://gael-varoquaux.info
http://twitter.com/GaelVaroquaux