[ https://issues.apache.org/jira/browse/SPARK-16768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15398571#comment-15398571 ]
Colin Beckingham edited comment on SPARK-16768 at 7/29/16 2:08 AM:
-------------------------------------------------------------------

This is very strange, then. I can launch Spark 2.1.0 with pyspark and run
"from pyspark.mllib.classification import LogisticRegressionWithLBFGS"; the
import succeeds, and I can call help on the imported class and get a
description of what it does. If there is no longer an LBFGS version, should
the import not fail with some warning that the command is deprecated?

I see from http://spark.apache.org/docs/latest/mllib-optimization.html that
the implementation of LBFGS is an issue that is "being worked on". That raises
the question of whether the currently working version in 1.6.2 is reliable;
right now, running the same problem on both 1.6.2 and 2.1.0 produces a much
faster and more accurate result on the former.
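One way to test the question raised in the comment, whether importing the class emits any deprecation warning, is to record warnings while the import runs. The following is only a minimal sketch and not part of the original report: the helper name import_and_collect_warnings is hypothetical, and it assumes pyspark is importable from the Python interpreter being used.

```python
import importlib
import warnings


def import_and_collect_warnings(module_name, attr_name):
    """Import module_name, look up attr_name, and return (object, warnings).

    The second element is a list of (category name, message) tuples for
    every warning raised while the import and attribute lookup ran.
    """
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, including DeprecationWarning, which the
        # default filters usually hide.
        warnings.simplefilter("always")
        module = importlib.import_module(module_name)
        obj = getattr(module, attr_name)
    return obj, [(w.category.__name__, str(w.message)) for w in caught]


if __name__ == "__main__":
    try:
        cls, caught = import_and_collect_warnings(
            "pyspark.mllib.classification", "LogisticRegressionWithLBFGS"
        )
    except ImportError:
        print("pyspark is not importable in this environment")
    else:
        print("imported:", cls.__name__)
        print("warnings during import:", caught if caught else "none")
```

An empty list here is not conclusive. A module that was already imported in the same session will not re-run its import-time code, and libraries often emit deprecation warnings when a method is called rather than when the class is imported, so the same recording approach may need to wrap the train() call as well.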
> pyspark calls incorrect version of logistic regression
> ------------------------------------------------------
>
>                 Key: SPARK-16768
>                 URL: https://issues.apache.org/jira/browse/SPARK-16768
>             Project: Spark
>          Issue Type: Bug
>          Components: MLlib, PySpark
>         Environment: Linux openSUSE Leap 42.1 Gnome
>            Reporter: Colin Beckingham
>             Fix For: 2.1.0
>
>
> PySpark call with Spark 1.6.2 "LogisticRegressionWithLBFGS.train()" runs
> "treeAggregate at LBFGS.scala:218" but the same command in pyspark with Spark
> 2.1 runs "treeAggregate at LogisticRegression.scala:1092". This non-optimized
> version is much slower and produces a different answer from LBFGS.