[GitHub] spark issue #22659: [SPARK-25623][TEST] Reduce test time of LogisticRegressi...
Github user shahidki31 commented on the issue: https://github.com/apache/spark/pull/22659

The test "binary logistic regression with intercept with ElasticNet regularization" takes around 30 sec to run. We can reduce this to about 15 sec by reducing the number of iterations.

![image](https://user-images.githubusercontent.com/23054875/46590813-0a54b080-cad4-11e8-8d27-9b049fc4537c.png)
model1 converges after 100 iterations.

![image](https://user-images.githubusercontent.com/23054875/46590826-19d3f980-cad4-11e8-9c81-4c42ac5559b8.png)
model2 converges after 20 iterations.

So, if we set maxIter for model1 and model2 to 120 and 30 respectively, we can reduce the time to ~15 sec.

The test "multinomial logistic regression without intercept with elasticnet regularization" also takes around 30 sec to run. This can likewise be reduced to about 15 sec by reducing the number of iterations.

![image](https://user-images.githubusercontent.com/23054875/46590808-032da280-cad4-11e8-8b8f-9e70632d.png)
model1 converges after 50 iterations.

![image](https://user-images.githubusercontent.com/23054875/46590819-10e32800-cad4-11e8-9ded-b29e68dfd0ff.png)
model2 converges after 30 iterations.

So, if we set maxIter for model1 and model2 to 75 and 50 respectively, we can reduce the computation time to less than 15 sec.

--- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
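The reasoning in the comment above (read off where the training cost flattens, then set maxIter a little beyond that point) can be sketched in a few lines. This is a minimal illustration, not the method used in the PR: the function name `convergence_iteration`, the `rel_tol` threshold, and the sample cost values are all hypothetical. In Spark, a per-iteration cost is exposed as `objectiveHistory` on the logistic regression training summary, which is one plausible source for such a curve, though the comment does not say how the plots were produced.

```python
def convergence_iteration(costs, rel_tol=1e-6):
    """Return the last iteration at which the cost still moved noticeably.

    costs[i] is the training cost after iteration i. A step counts as
    "moving" if it changes the cost by at least rel_tol relative to the
    larger of the two neighbouring values. Every step after the returned
    index is below that threshold.
    """
    if not costs:
        raise ValueError("costs must not be empty")
    last_moving = 0
    for i in range(1, len(costs)):
        prev, cur = costs[i - 1], costs[i]
        scale = max(abs(prev), abs(cur), 1e-12)  # guard against division by zero
        if abs(prev - cur) / scale >= rel_tol:
            last_moving = i
    return last_moving


# Hypothetical cost curve: decays quickly, then stays flat from iteration 5 on.
history = [0.69, 0.50, 0.41, 0.38, 0.371, 0.370, 0.370, 0.370, 0.370]
converged_at = convergence_iteration(history, rel_tol=1e-3)
print(converged_at)  # 5
```

A maxIter value would then be chosen somewhat above the returned index. How much headroom to leave is a judgment call, as with the values proposed in the comment.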
[GitHub] spark issue #22659: [SPARK-25623][TEST] Reduce test time of LogisticRegressi...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22659 **[Test build #97094 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97094/testReport)** for PR 22659 at commit [`c28fd05`](https://github.com/apache/spark/commit/c28fd05f259a681a74ab34d2be1818c205bf29a9).
[GitHub] spark issue #22659: [SPARK-25623][TEST] Reduce test time of LogisticRegressi...
Github user shahidki31 commented on the issue: https://github.com/apache/spark/pull/22659

The test "multinomial logistic regression with intercept with elasticnet regularization" in "LogisticRegressionSuite" takes around 1 minute to train 2 logistic regression models. However, after analyzing the training cost over the iterations, we can cut the computation time by up to 50%.

Training cost vs iteration for model 1:
![image](https://user-images.githubusercontent.com/23054875/46590546-c496e880-cad1-11e8-8539-5bc9853c33ca.png)
So, model1 converges after iteration 200.

Training cost vs iteration for model 2:
![image](https://user-images.githubusercontent.com/23054875/46590551-ca8cc980-cad1-11e8-8e83-24ad220e1618.png)
After around 50 iterations, model2 converges.

So, if we set the maximum iterations for model1 and model2 to 220 and 90 respectively, we can reduce the computation time substantially.

Computation time in local setup:
Before change: ~54 sec
After change: ~35 sec
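As a quick check of the local timings quoted in the comment above, the saving can be worked out directly. The helper name `relative_reduction` is hypothetical, and the inputs are the approximate figures from the comment, so the result is only as precise as those.

```python
def relative_reduction(before_sec, after_sec):
    """Fraction of the original run time that was saved."""
    if before_sec <= 0:
        raise ValueError("before_sec must be positive")
    return (before_sec - after_sec) / before_sec


# Approximate local timings from the comment: ~54 sec before, ~35 sec after.
saved = relative_reduction(54, 35)
print(f"{saved:.0%}")  # 35%
```

On these figures the measured saving is about 35%, somewhat less than half.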
[GitHub] spark issue #22659: [SPARK-25623][TEST] Reduce test time of LogisticRegressi...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22659 **[Test build #97093 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97093/testReport)** for PR 22659 at commit [`3d9673e`](https://github.com/apache/spark/commit/3d9673e4014872b3b0583b86e134bcbdd27f6e39).
[GitHub] spark issue #22659: [SPARK-25623][TEST] Reduce test time of LogisticRegressi...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22659 Merged build finished. Test PASSed.
[GitHub] spark issue #22659: [SPARK-25623][TEST] Reduce test time of LogisticRegressi...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22659 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97087/
[GitHub] spark issue #22659: [SPARK-25623][TEST] Reduce test time of LogisticRegressi...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22659 **[Test build #97087 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97087/testReport)** for PR 22659 at commit [`2040ada`](https://github.com/apache/spark/commit/2040ada029bc8f8b894b724706acb0450c2874b5).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #22659: [SPARK-25623][TEST] Reduce test time of LogisticRegressi...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22659 **[Test build #97087 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97087/testReport)** for PR 22659 at commit [`2040ada`](https://github.com/apache/spark/commit/2040ada029bc8f8b894b724706acb0450c2874b5).
[GitHub] spark issue #22659: [SPARK-25623][TEST] Reduce test time of LogisticRegressi...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22659 ok to test
[GitHub] spark issue #22659: [SPARK-25623][TEST] Reduce test time of LogisticRegressi...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22659 Can one of the admins verify this patch?