[jira] [Comment Edited] (SPARK-16768) pyspark calls incorrect version of logistic regression

Colin Beckingham (JIRA) Thu, 28 Jul 2016 19:10:11 -0700

    [ https://issues.apache.org/jira/browse/SPARK-16768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15398571#comment-15398571 ]


Colin Beckingham edited comment on SPARK-16768 at 7/29/16 2:08 AM:
-------------------------------------------------------------------

This is very strange, then. I can launch Spark 2.1.0 with pyspark, run "from 
pyspark.mllib.classification import LogisticRegressionWithLBFGS", and the import 
succeeds; I can also call help() on the imported class and get a description of 
what it does. If there is no longer an LBFGS version, shouldn't the import fail, 
or at least warn that the class is deprecated? I see from 
http://spark.apache.org/docs/latest/mllib-optimization.html that implementation 
of LBFGS is an issue that is "being worked on". That raises the question of 
whether the currently working version in 1.6.2 is reliable; right now, running 
the same problem on both 1.6.2 and 2.1.0 gives a much faster and more accurate 
result on the former.
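
The expectation raised above, that a deprecated class should announce itself rather than import silently, can be illustrated with Python's standard `warnings` module. This is only a minimal sketch and not Spark's actual code: `LegacyTrainer` and `train_and_collect_warnings` are hypothetical names standing in for a deprecated API, and the "model" is just the mean of the input.

```python
import warnings


class LegacyTrainer:
    """Hypothetical stand-in for a deprecated training class (not a Spark API)."""

    @classmethod
    def train(cls, data):
        # Signal the deprecation to the caller, but keep working.
        warnings.warn(
            "LegacyTrainer.train is deprecated; use the newer API instead",
            DeprecationWarning,
            stacklevel=2,
        )
        # Placeholder "model": the mean of the data.
        return sum(data) / len(data)


def train_and_collect_warnings(data):
    """Run the deprecated call and return (result, deprecation messages)."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, including ones the default filters would hide.
        warnings.simplefilter("always")
        result = LegacyTrainer.train(data)
    messages = [
        str(w.message)
        for w in caught
        if issubclass(w.category, DeprecationWarning)
    ]
    return result, messages
```

Python's default filters suppress most `DeprecationWarning`s, so even a class that does warn can appear to import and run silently unless warnings are enabled explicitly, as the helper above does with `simplefilter("always")`.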


> pyspark calls incorrect version of logistic regression
> ------------------------------------------------------
>
>                 Key: SPARK-16768
>                 URL: https://issues.apache.org/jira/browse/SPARK-16768
>             Project: Spark
>          Issue Type: Bug
>          Components: MLlib, PySpark
>         Environment: Linux openSUSE Leap 42.1 Gnome
>            Reporter: Colin Beckingham
>             Fix For: 2.1.0
>
>
> A PySpark call to "LogisticRegressionWithLBFGS.train()" with Spark 1.6.2 runs 
> "treeAggregate at LBFGS.scala:218", but the same call in pyspark with Spark 
> 2.1 runs "treeAggregate at LogisticRegression.scala:1092". This non-optimized 
> version is much slower and produces a different answer from LBFGS.


