[jira] [Commented] (SPARK-17870) ML/MLLIB: ChiSquareSelector based on Statistics.chiSqTest(RDD) is wrong

2016-10-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15567244#comment-15567244 ] Apache Spark commented on SPARK-17870: -- User 'mpjlu' has created a pull request for this issue:

[jira] [Commented] (SPARK-17870) ML/MLLIB: ChiSquareSelector based on Statistics.chiSqTest(RDD) is wrong

2016-10-11 Thread Peng Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15567170#comment-15567170 ] Peng Meng commented on SPARK-17870: --- Hi [~avulanov], the question here is not whether to use raw chi2 scores or

[jira] [Commented] (SPARK-17870) ML/MLLIB: ChiSquareSelector based on Statistics.chiSqTest(RDD) is wrong

2016-10-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566586#comment-15566586 ] Sean Owen commented on SPARK-17870: --- If the degrees of freedom are the same across the tests, then

[jira] [Commented] (SPARK-17870) ML/MLLIB: ChiSquareSelector based on Statistics.chiSqTest(RDD) is wrong

2016-10-11 Thread Alexander Ulanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566467#comment-15566467 ] Alexander Ulanov commented on SPARK-17870: --

[jira] [Commented] (SPARK-17870) ML/MLLIB: ChiSquareSelector based on Statistics.chiSqTest(RDD) is wrong

2016-10-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15566284#comment-15566284 ] Sean Owen commented on SPARK-17870: --- OK I get it, they're doing different things really. The scikit