[GitHub] spark pull request: [SPARK-14464] [MLLIB] Better support for logis...

2016-05-03 Thread daniel-siegmann-aol
Github user daniel-siegmann-aol commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-216540912 Will do. Thank you for your help!

2016-05-02 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-216393823 Currently, we're working on the 2.0 release. Please ping us again after the release.

2016-05-02 Thread daniel-siegmann-aol
Github user daniel-siegmann-aol commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-216277142 Now that the tests are passing, what's the next step? Is there anything else needed from me at this time? (Please forgive my ignorance, this is the first

2016-04-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215751772 Merged build finished. Test PASSed.

2016-04-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215751775 Test PASSed. Refer to this link for build results (access rights to CI server needed):

2016-04-29 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215751317 **[Test build #57335 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57335/consoleFull)** for PR 12761 at commit

2016-04-29 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215712017 **[Test build #57335 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57335/consoleFull)** for PR 12761 at commit

2016-04-29 Thread MLnick
Github user MLnick commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215710885 jenkins retest this please

2016-04-29 Thread daniel-siegmann-aol
Github user daniel-siegmann-aol commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215709655 Good news is the previous failed test now passes. Bad news is two new tests fail: MemorySinkSuite (SQL streaming) and GraphSuite (GraphX). Neither of these

2016-04-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215592564 Test FAILed. Refer to this link for build results (access rights to CI server needed):

2016-04-28 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215592449 **[Test build #57283 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57283/consoleFull)** for PR 12761 at commit

2016-04-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215592560 Merged build finished. Test FAILed.

2016-04-28 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215575102 **[Test build #57283 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57283/consoleFull)** for PR 12761 at commit

2016-04-28 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215573982 Jenkins, retest this please

2016-04-28 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215573910 Jenkins, test this please

2016-04-28 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215573882 test this please

2016-04-28 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215573656 Jenkins, please test it again.

2016-04-28 Thread daniel-siegmann-aol
Github user daniel-siegmann-aol commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215572538 Thanks. I'm pretty confident this does what it's supposed to do; my main concern is making sure performance doesn't degrade for anything else. The

2016-04-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215570785 Merged build finished. Test FAILed.

2016-04-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215570788 Test FAILed. Refer to this link for build results (access rights to CI server needed):

2016-04-28 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215570576 **[Test build #57272 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57272/consoleFull)** for PR 12761 at commit

2016-04-28 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215564655 This seems very promising. Since the 2.0 window will close soon, it's unlikely to get into 2.0. Let's target 2.1.

2016-04-28 Thread daniel-siegmann-aol
Github user daniel-siegmann-aol commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215564208 Note that the results above are from my production data. For comparison, I'm told by one of our data scientists the training can be done locally

2016-04-28 Thread daniel-siegmann-aol
Github user daniel-siegmann-aol commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215561740 I'll give the results of my own training flow too. Testing was done on EMR 4.4.0 with Spark 1.6.0. The cluster was configured with six r3.8xlarge nodes:

2016-04-28 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215561453 You may use some fake data to demonstrate how this PR improves. Thanks.
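
For readers who want to try the fake-data suggestion, here is a minimal Python sketch. It is not code from the PR and does not use Spark; the function names, row count, feature count, and non-zeros per row are all assumptions chosen for illustration. It generates very sparse rows and accumulates per-feature sums two ways: into a buffer with one slot per feature, and into a map that only stores features that actually occur.

```python
import random


def make_sparse_rows(n_rows, n_features, nnz_per_row, seed=0):
    """Generate fake rows as (indices, values) pairs with few non-zeros each."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_rows):
        indices = sorted(rng.sample(range(n_features), nnz_per_row))
        values = [rng.uniform(-1.0, 1.0) for _ in indices]
        rows.append((indices, values))
    return rows


def dense_feature_sums(rows, n_features):
    """Accumulate per-feature sums into a buffer with one slot per feature."""
    sums = [0.0] * n_features
    for indices, values in rows:
        for i, v in zip(indices, values):
            sums[i] += v
    return sums


def sparse_feature_sums(rows):
    """Accumulate per-feature sums, storing only features that occur."""
    sums = {}
    for indices, values in rows:
        for i, v in zip(indices, values):
            sums[i] = sums.get(i, 0.0) + v
    return sums


# Hypothetical sizes: 1,000 rows, 1,000,000 features, 5 non-zeros per row.
N_FEATURES = 1_000_000
rows = make_sparse_rows(n_rows=1000, n_features=N_FEATURES, nnz_per_row=5)
dense = dense_feature_sums(rows, N_FEATURES)
sparse = sparse_feature_sums(rows)

# The dense buffer holds every feature; the sparse map holds at most 5,000.
print(len(dense), len(sparse))
```

Both accumulators hold the same sums for every feature that occurs, but the dense buffer's size grows with the feature count while the map's size grows only with the number of distinct non-zero features. A real benchmark for the PR would still need to run Spark with and without the patch on such data.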

2016-04-28 Thread daniel-siegmann-aol
Github user daniel-siegmann-aol commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215557197 Is there some Spark benchmark you want me to run?

2016-04-28 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215553815 Can you also post the benchmark results with and without this PR for very sparse features? Thanks.

2016-04-28 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215545319 **[Test build #57272 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57272/consoleFull)** for PR 12761 at commit

2016-04-28 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215539469 Make it build first, and then we can start to review the code. Thanks.

2016-04-28 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215539228 You need to manually add it to the MiMa excludes.

2016-04-28 Thread daniel-siegmann-aol
Github user daniel-siegmann-aol commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215536649 Binary compatibility check failed: method this()Unit in class org.apache.spark.mllib.stat.MultivariateOnlineSummarizer does not have a correspondent in

2016-04-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215534202 Merged build finished. Test FAILed.

2016-04-28 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215534169 **[Test build #57269 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57269/consoleFull)** for PR 12761 at commit

2016-04-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215534206 Test FAILed. Refer to this link for build results (access rights to CI server needed):

2016-04-28 Thread SparkQA
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215532308 **[Test build #57269 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57269/consoleFull)** for PR 12761 at commit

2016-04-28 Thread dbtsai
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215532091 Jenkins, ok to test.

2016-04-28 Thread daniel-siegmann-aol
Github user daniel-siegmann-aol commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215531118 I should mention Nick Pentreath pointed me in the right direction on the dev list. So, credit to him for the help!

2016-04-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/12761#issuecomment-215526701 Can one of the admins verify this patch?

2016-04-28 Thread daniel-siegmann-aol
GitHub user daniel-siegmann-aol opened a pull request: https://github.com/apache/spark/pull/12761
[SPARK-14464] [MLLIB] Better support for logistic regression when features are sparse
## What changes were proposed in this pull request?
Where aggregations were being done