Hi,

Wonderful. I was sampling the output, but with a bug, and your comment made me realize it :). I was indeed a victim of the complete separability issue :).
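For anyone landing on this thread later, here is a minimal sketch of the label-sampling step in question. It is not the code from either message. The thread does not say what the bug actually was, so the thresholding branch below is only one hypothetical way of ending up with perfectly separable labels. The object name `SyntheticLogisticData` and its helpers are made up for illustration and use plain Scala rather than Spark.

```scala
import scala.util.Random

object SyntheticLogisticData {
  // True weights from the thread: three features, no intercept.
  val trueWeights: Array[Double] = Array(2.0, 3.0, 4.0)

  def sigmoid(z: Double): Double = 1.0 / (1.0 + math.exp(-z))

  def dot(a: Array[Double], b: Array[Double]): Double =
    a.zip(b).map { case (x, y) => x * y }.sum

  /** Draws n (label, features) pairs with standard-normal features.
    *
    * sampleLabel = true  : label is a Bernoulli(p) draw, p = sigmoid(w . x)
    * sampleLabel = false : label is 1 whenever p > 0.5 (hypothetical buggy
    *                       variant; the labels become linearly separable)
    */
  def generate(n: Int, seed: Long, sampleLabel: Boolean): Seq[(Double, Array[Double])] = {
    val rng = new Random(seed)
    Seq.fill(n) {
      val x = Array.fill(trueWeights.length)(rng.nextGaussian())
      val p = sigmoid(dot(trueWeights, x))
      val y =
        if (sampleLabel) { if (rng.nextDouble() < p) 1.0 else 0.0 }
        else { if (p > 0.5) 1.0 else 0.0 }
      (y, x)
    }
  }

  /** Fraction of points whose label matches the sign of w . x.
    * A value of exactly 1.0 means the true weights separate the data perfectly.
    */
  def agreement(data: Seq[(Double, Array[Double])]): Double =
    data.count { case (y, x) => (dot(trueWeights, x) > 0) == (y == 1.0) }.toDouble / data.size
}
```

With thresholded labels, `agreement` comes out as 1.0, and on such data the logistic likelihood keeps improving as the weights are scaled up along the same direction, so only the regularization term bounds their size. That would be consistent with the fitted weights being roughly 3 times `[2, 3, 4]`. With Bernoulli-sampled labels some points fall on the "wrong" side, and the fit should land near the true weights instead.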
Thanks a lot.

with regards,
Nikhil

On Tue, Nov 17, 2015 at 5:26 PM, DB Tsai <dbt...@dbtsai.com> wrote:

> How do you compute the probability given the weights? Also, given a
> probability, you need to sample positive and negative based on the
> probability, and how do you do this? I'm pretty sure that the LoR will
> give you correct weights, and please see the
> generateMultinomialLogisticInput in
>
> https://github.com/apache/spark/blob/master/mllib/src/test/scala/org/apache/spark/mllib/classification/LogisticRegressionSuite.scala
>
> Sincerely,
>
> DB Tsai
> ----------------------------------------------------------
> Web: https://www.dbtsai.com
> PGP Key ID: 0xAF08DF8D
>
>
> On Tue, Nov 17, 2015 at 4:11 PM, njoshi <nikhil.jo...@teamaol.com> wrote:
>
> > I am testing the LogisticRegression performance on a synthetically
> > generated data. The weights I have as input are
> >
> > w = [2, 3, 4]
> >
> > with no intercept and three features. After training on 1000 synthetically
> > generated datapoint assuming random normal distribution for each, the Spark
> > LogisticRegression model I obtain has weights as
> >
> > [6.005520656096823,9.35980263762698,12.203400879214152]
> >
> > I can see that each weight is scaled by a factor close to '3' w.r.t. the
> > original values. I am unable to guess the reason behind this. The code is
> > simple enough as
> >
> >
> > /*
> >  * Logistic Regression model
> >  */
> > val lr = new LogisticRegression()
> >   .setMaxIter(50)
> >   .setRegParam(0.001)
> >   .setElasticNetParam(0.95)
> >   .setFitIntercept(false)
> >
> > val lrModel = lr.fit(trainingData)
> >
> >
> > println(s"${lrModel.weights}")
> >
> >
> >
> > I would greatly appreciate if someone could shed some light on what's fishy
> > here.
> >
> > with kind regards,
> > Nikhil
> >
> > --
> > View this message in context:
> > http://apache-spark-user-list.1001560.n3.nabble.com/Spark-LogisticRegression-returns-scaled-coefficients-tp25405.html

--
Nikhil Joshi
Princ Data Scientist
Aol PLATFORMS.
395 Page Mill Rd, Palo Alto, CA 94306-2024
vvmr: 8894737