[GitHub] spark issue #17862: [SPARK-20602] [ML]Adding LBFGS optimizer and Squared_hin...

hhbyyh Tue, 12 Sep 2017 09:14:01 -0700

Github user hhbyyh commented on the issue:

    https://github.com/apache/spark/pull/17862
  
    Tested on several larger datasets with the hinge loss function to compare the l-bfgs and owlqn solvers.
    Each run continued until it converged or exceeded maxIter (2000).
    
    dataset | numRecords | numFeatures | l-bfgs iterations | owlqn iterations | l-bfgs final loss | owlqn final loss
    --------|------------|-------------|-------------------|------------------|-------------------|-----------------
    url_combined | 2396130 | 3231961 | 317 (952 sec) | 287 (1661 sec) | 9.71E-5 | 1.64E-4
    kdda | 8407752 | 20216830 | 2000+ (29729 sec) | 288 (13664 sec) | 0.0068 | 0.0135
    webspam | 350000 | 254 | 344 (67 sec) | 1502 (714 sec) | 0.18273 | 0.18273
    SUSY | 5000000 | 18 | 152 (145 sec) | 1242 (3357 sec) | 0.499 | 0.499
    
    l-bfgs does not always take fewer iterations, but its final loss is never larger than owlqn's: it is smaller on url_combined and kdda, and equal on webspam and SUSY.
    Per iteration, owlqn takes roughly 2 to 3 times as long as l-bfgs. Logistic Regression exhibits similar behavior.
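
    The per-iteration claim can be checked directly against the table. The sketch below is only illustrative arithmetic in Python, not part of the PR: the `runs` dictionary and the helper names are hypothetical, the numbers are copied from the table, and the kdda l-bfgs run (reported as "2000+") is assumed to have stopped at exactly 2000 iterations.

```python
# (iterations, total seconds) per solver, copied from the table above.
# Assumption: kdda l-bfgs hit maxIter, so "2000+" is treated as 2000.
runs = {
    "url_combined": {"l-bfgs": (317, 952), "owlqn": (287, 1661)},
    "kdda": {"l-bfgs": (2000, 29729), "owlqn": (288, 13664)},
    "webspam": {"l-bfgs": (344, 67), "owlqn": (1502, 714)},
    "SUSY": {"l-bfgs": (152, 145), "owlqn": (1242, 3357)},
}


def seconds_per_iteration(iterations, seconds):
    """Average wall-clock time of one iteration."""
    return seconds / iterations


def per_iteration_ratio(dataset):
    """How many times longer one owlqn iteration takes than one l-bfgs iteration."""
    lbfgs = seconds_per_iteration(*runs[dataset]["l-bfgs"])
    owlqn = seconds_per_iteration(*runs[dataset]["owlqn"])
    return owlqn / lbfgs


for name in runs:
    print(f"{name}: owlqn/l-bfgs per-iteration time = {per_iteration_ratio(name):.2f}")
```

    Under these assumptions the ratio falls between roughly 1.9 and 3.2 on all four datasets.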




---

