[GitHub] spark issue #22659: [SPARK-25623][TEST] Reduce test time of LogisticRegressi...

2018-10-07 Thread shahidki31
Github user shahidki31 commented on the issue:

https://github.com/apache/spark/pull/22659
  
The test "binary logistic regression with intercept with ElasticNet 
regularization" takes around 30 sec to run, but we can reduce that to about 
15 sec by lowering the number of iterations.


![image](https://user-images.githubusercontent.com/23054875/46590813-0a54b080-cad4-11e8-8d27-9b049fc4537c.png)
model1 converges after 100 iterations.

![image](https://user-images.githubusercontent.com/23054875/46590826-19d3f980-cad4-11e8-9c81-4c42ac5559b8.png)
model2 converges after 20 iterations.
So, if we set maxIter of model1 and model2 to 120 and 30 respectively, we 
can reduce the time to ~15 sec.

The test "multinomial logistic regression without intercept with 
elasticnet regularization" also takes around 30 sec to run. This can likewise 
be reduced to about 15 sec by lowering the number of iterations.

![image](https://user-images.githubusercontent.com/23054875/46590808-032da280-cad4-11e8-8b8f-9e70632d.png)
model1 converges after 50 iterations.

![image](https://user-images.githubusercontent.com/23054875/46590819-10e32800-cad4-11e8-9ded-b29e68dfd0ff.png)
model2 converges after 30 iterations.
So, if we set maxIter of model1 and model2 to 75 and 50 respectively, we 
can reduce the computation time to less than 15 sec.




---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22659: [SPARK-25623][TEST] Reduce test time of LogisticRegressi...

2018-10-07 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22659
  
**[Test build #97094 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97094/testReport)**
 for PR 22659 at commit 
[`c28fd05`](https://github.com/apache/spark/commit/c28fd05f259a681a74ab34d2be1818c205bf29a9).





[GitHub] spark issue #22659: [SPARK-25623][TEST] Reduce test time of LogisticRegressi...

2018-10-07 Thread shahidki31
Github user shahidki31 commented on the issue:

https://github.com/apache/spark/pull/22659
  
The test "multinomial logistic regression with intercept with 
elasticnet regularization" in "LogisticRegressionSuite" takes around 1 
minute to train 2 logistic regression models.
However, after analyzing the training cost per iteration, we can reduce the 
computation time by about 50%.
Training cost vs iteration for model 1:


![image](https://user-images.githubusercontent.com/23054875/46590546-c496e880-cad1-11e8-8539-5bc9853c33ca.png)


So, model1 converges after iteration 200.

Training cost vs iteration for model 2:

![image](https://user-images.githubusercontent.com/23054875/46590551-ca8cc980-cad1-11e8-8e83-24ad220e1618.png)

model2 converges after around 50 iterations.
So, if we set the maximum iterations for model1 and model2 to 220 and 90 
respectively, we can reduce the computation time by about half.
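
The reasoning above (find where the training cost flattens, then set maxIter a little past that point) can be sketched in plain Python. This is only an illustration and not the code used for the plots in this thread: `convergence_iteration`, `suggest_max_iter`, `rel_tol` and `headroom` are hypothetical names, and the cost list is assumed to have been collected separately, one value per iteration.

```python
from typing import Sequence


def convergence_iteration(costs: Sequence[float], rel_tol: float = 1e-6) -> int:
    """Return the last iteration index at which the cost still changed noticeably.

    costs[i] is the training cost after iteration i. A step counts as
    significant when the relative change from the previous cost is at
    least rel_tol (an assumed threshold, not taken from the thread).
    """
    last_significant = 0
    for i in range(1, len(costs)):
        prev = costs[i - 1]
        change = abs(prev - costs[i]) / max(abs(prev), 1e-12)
        if change >= rel_tol:
            last_significant = i
    return last_significant


def suggest_max_iter(costs: Sequence[float], rel_tol: float = 1e-6,
                     headroom: int = 20) -> int:
    """Propose a maxIter: the convergence point plus a safety margin."""
    return convergence_iteration(costs, rel_tol) + headroom


# Toy cost history (made up for illustration): flat from iteration 3 onward.
toy_costs = [10.0, 5.0, 2.5, 2.0, 2.0, 2.0]
print(convergence_iteration(toy_costs))
print(suggest_max_iter(toy_costs, headroom=2))
```

The margin is a judgment call; the values proposed in this thread leave between 10 and 40 iterations of slack past the observed convergence point.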

Computation time in local setup:
Before change: ~54 sec
After change: ~35 sec
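
For reference, the relative saving implied by the two local timings can be worked out directly. A minimal sketch, where `percent_reduction` is a hypothetical helper and the inputs are the approximate figures quoted above:

```python
def percent_reduction(before_s: float, after_s: float) -> float:
    """Percentage by which the run time dropped, relative to the original time."""
    return 100.0 * (before_s - after_s) / before_s


# Approximate local timings quoted above: ~54 sec before, ~35 sec after.
print(round(percent_reduction(54, 35), 1))  # prints 35.2
```

On these local numbers the measured saving is roughly 35%, somewhat below the estimated 50%.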





[GitHub] spark issue #22659: [SPARK-25623][TEST] Reduce test time of LogisticRegressi...

2018-10-07 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22659
  
**[Test build #97093 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97093/testReport)**
 for PR 22659 at commit 
[`3d9673e`](https://github.com/apache/spark/commit/3d9673e4014872b3b0583b86e134bcbdd27f6e39).





[GitHub] spark issue #22659: [SPARK-25623][TEST] Reduce test time of LogisticRegressi...

2018-10-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22659
  
Merged build finished. Test PASSed.





[GitHub] spark issue #22659: [SPARK-25623][TEST] Reduce test time of LogisticRegressi...

2018-10-07 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22659
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97087/





[GitHub] spark issue #22659: [SPARK-25623][TEST] Reduce test time of LogisticRegressi...

2018-10-07 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22659
  
**[Test build #97087 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97087/testReport)**
 for PR 22659 at commit 
[`2040ada`](https://github.com/apache/spark/commit/2040ada029bc8f8b894b724706acb0450c2874b5).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #22659: [SPARK-25623][TEST] Reduce test time of LogisticRegressi...

2018-10-07 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22659
  
**[Test build #97087 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97087/testReport)**
 for PR 22659 at commit 
[`2040ada`](https://github.com/apache/spark/commit/2040ada029bc8f8b894b724706acb0450c2874b5).





[GitHub] spark issue #22659: [SPARK-25623][TEST] Reduce test time of LogisticRegressi...

2018-10-07 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22659
  
ok to test





[GitHub] spark issue #22659: [SPARK-25623][TEST] Reduce test time of LogisticRegressi...

2018-10-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22659
  
Can one of the admins verify this patch?




