Thanks for the reference! Many tests are not designed for big data:
http://magazine.amstat.org/blog/2010/09/01/statrevolution/ . So we need
to work out which tests are appropriate at that scale. Feel free to create
a JIRA and let's move our discussion there. -Xiangrui
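
[Editor's illustration, not part of the original message.] One common reading of the "not designed for big data" concern is that a classical test statistic grows with the sample size, so a deviation of fixed relative size eventually looks significant however small it is. The plain-Python sketch below shows this for a Pearson chi-squared goodness-of-fit statistic. It is an assumption about what is meant here, not a summary of the linked article and not MLlib code; the function names and the counts are hypothetical.

```python
# Sketch: the Pearson chi-squared statistic scales linearly with sample size
# when the observed proportions stay fixed. All names and data are hypothetical.

CRITICAL_VALUE_DF1_ALPHA_05 = 3.841  # chi-squared critical value, 1 degree of freedom, alpha = 0.05


def pearson_chi_squared(observed, expected_probs):
    """Return sum((O_i - E_i)^2 / E_i) with E_i = n * p_i."""
    n = sum(observed)
    return sum(
        (o - n * p) ** 2 / (n * p)
        for o, p in zip(observed, expected_probs)
    )


def scaled_counts(base_counts, factor):
    """Same proportions, `factor` times as many observations."""
    return [c * factor for c in base_counts]


base = [510, 490]      # 51% / 49% split in 1,000 observations
fair = [0.5, 0.5]      # null hypothesis: a 50/50 split

stat_small = pearson_chi_squared(base, fair)
stat_large = pearson_chi_squared(scaled_counts(base, 1000), fair)

print(stat_small, stat_small > CRITICAL_VALUE_DF1_ALPHA_05)
print(stat_large, stat_large > CRITICAL_VALUE_DF1_ALPHA_05)
```

With the same 51/49 split, the statistic stays below the critical value at 1,000 observations and far exceeds it at 1,000,000, which is why effect sizes matter alongside p-values on large data sets.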

On Fri, Aug 22, 2014 at 8:44 PM, guxiaobo1982 <guxiaobo1...@qq.com> wrote:
> Hi Xiangrui,
>
> You can refer to "An Introduction to Statistical Learning with Applications
> in R". There are many standard hypothesis tests for linear regression and
> logistic regression; they should be implemented first. Then we will list
> some other tests, which are also important when using logistic regression
> to build scorecards.
>
> Xiaobo Gu
>
>
> ------------------ Original ------------------
> From: "Xiangrui Meng";<men...@gmail.com>;
> Send time: Wednesday, Aug 20, 2014 2:18 PM
> To: ""<guxiaobo1...@qq.com>;
> Cc: "user@spark.apache.org"<user@spark.apache.org>;
> Subject: Re: What about implementing various hypothesis test for
> LogisticRegression in MLlib
>
> We implemented chi-squared tests in v1.1:
> https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/mllib/stat/Statistics.scala#L166
> and we will add more after v1.1. Feedback on which tests should come
> first would be greatly appreciated. -Xiangrui
>
> On Tue, Aug 19, 2014 at 9:50 PM, guxiaobo1982 <guxiaobo1...@qq.com> wrote:
>> Hi,
>>
>> From the documentation I think only the model fitting part is implemented.
>> What about the various hypothesis tests and performance indexes used to
>> evaluate the model fit?
>>
>> Regards,
>>
>> Xiaobo Gu
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org