Hi Alessandro,

MLlib v1.1 supports variance for regression, and Gini impurity and entropy for classification: http://spark.apache.org/docs/latest/mllib-decision-tree.html
If the information gain calculation can be performed by distributed aggregation, then it might be possible to plug it into the existing implementation. We want to perform such calculations (e.g., the median) for the gradient boosting models (coming up in the 1.2 release) using absolute error and deviance as loss functions, but I don't think anyone is planning to work on it yet. :-)

-Manish

On Mon, Nov 17, 2014 at 11:11 AM, Alessandro Baretta <alexbare...@gmail.com> wrote:
> I see that, as of v. 1.1, MLlib supports regression and classification tree
> models. I assume this means that it uses a squared-error loss function for
> the first and a logistic cost function for the second. I don't see support
> for quantile regression via an absolute error cost function. Or am I
> missing something?
>
> If, as it seems, this is missing, how do you recommend implementing it?
>
> Alex
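To illustrate the distributed-aggregation point above: variance impurity works in the existing tree implementation because it can be computed from per-partition sufficient statistics (count, sum, sum of squares) that merge associatively, whereas the exact median needed for absolute error has no fixed-size mergeable summary. A minimal sketch in plain Python (not MLlib code; the function names are made up for illustration):

```python
def partition_stats(values):
    """Per-partition sufficient statistics for variance:
    (count, sum, sum of squares)."""
    n = len(values)
    s = sum(values)
    ss = sum(v * v for v in values)
    return (n, s, ss)

def merge(a, b):
    """Merging is associative and commutative, so partitions
    can be combined in any order (as in a Spark aggregation)."""
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])

def variance(stats):
    """Population variance recovered from the merged statistics."""
    n, s, ss = stats
    return ss / n - (s / n) ** 2

# Two "partitions" of label values:
left = partition_stats([1.0, 2.0, 3.0])
right = partition_stats([4.0, 5.0])
total = merge(left, right)
print(variance(total))  # same as the variance of the full list
```

No analogous constant-size summary exists for the median, which is why absolute-error loss would need something extra, e.g. an approximate quantile computation, rather than simply plugging a new impurity into the current aggregation.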