Xiangrui, Christopher,

Thanks for responding. I'll go through the code in detail to evaluate whether the loss function used is suitable for our dataset. I'll also go through the referenced paper, since I was unaware of the underlying theory. Thanks again.
-Bharath

On Thu, May 29, 2014 at 8:16 AM, Christopher Nguyen <c...@adatao.com> wrote:

> Bharath, (apologies if you're already familiar with the theory): the
> proposed approach may or may not be appropriate depending on the overall
> transfer function in your data. In general, a single logistic regressor
> cannot approximate arbitrary non-linear functions (of linear combinations
> of the inputs). You can review works by, e.g., Hornik and Cybenko in the
> late 80's to see if you need something more, such as a simple, one
> hidden-layer neural network.
>
> This is a good summary:
>
> http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.101.2647&rep=rep1&type=pdf
>
> --
> Christopher T. Nguyen
> Co-founder & CEO, Adatao <http://adatao.com>
> linkedin.com/in/ctnguyen
>
> On Wed, May 28, 2014 at 11:18 AM, Bharath Ravi Kumar <reachb...@gmail.com>
> wrote:
>
> > I'm looking to reuse the LogisticRegression model (with SGD) to predict a
> > real-valued outcome variable. (I understand that logistic regression is
> > generally applied to predict binary outcome, but for various reasons, this
> > model suits our needs better than LinearRegression). Related to that I have
> > the following questions:
> >
> > 1) Can the current LogisticRegression model be used as is to train based on
> > binary input (i.e. explanatory) features, or is there an assumption that
> > the explanatory features must be continuous?
> >
> > 2) I intend to reuse the current class to train a model on LabeledPoints
> > where the label is a real value (and not 0 / 1). I'd like to know if
> > invoking setValidateData(false) would suffice or if one must override the
> > validator to achieve this.
> >
> > 3) I recall seeing an experimental method on the class (
> > https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/mllib/classification/LogisticRegression.scala
> > ) that clears the threshold separating positive & negative predictions.
> > Once the model is trained on real valued labels, would clearing this flag
> > suffice to predict an outcome that is continuous in nature?
> >
> > Thanks,
> > Bharath
> >
> > P.S: I'm writing to dev@ and not user@ assuming that lib changes might be
> > necessary. Apologies if the mailing list is incorrect.
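
A minimal sketch, outside Spark, of what questions 2) and 3) amount to: logistic-loss SGD trained on real-valued labels in [0, 1] with a single binary explanatory feature, then queried for the raw score instead of a thresholded class. This is plain Python and not the MLlib implementation; the names train_logistic_sgd and predict_raw, the step size, the iteration count, and the toy data are all assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic_sgd(points, num_features, step_size=0.5,
                       num_iterations=200, seed=0):
    """Plain SGD on the logistic (cross-entropy) loss.

    points is a list of (label, features) pairs. The label is not
    restricted to 0 / 1 here; any value in [0, 1] is accepted.
    """
    rng = random.Random(seed)
    weights = [0.0] * num_features
    intercept = 0.0
    for _ in range(num_iterations):
        shuffled = points[:]
        rng.shuffle(shuffled)
        for label, features in shuffled:
            margin = intercept + sum(w * x for w, x in zip(weights, features))
            # Derivative of the cross-entropy loss with respect to the margin.
            error = sigmoid(margin) - label
            weights = [w - step_size * error * x
                       for w, x in zip(weights, features)]
            intercept -= step_size * error
    return weights, intercept


def predict_raw(weights, intercept, features):
    # Raw score with no threshold applied: a value between 0 and 1.
    margin = intercept + sum(w * x for w, x in zip(weights, features))
    return sigmoid(margin)


# Hypothetical toy data: one binary feature, fractional labels.
points = [(0.2, [0.0])] * 10 + [(0.7, [1.0])] * 10
weights, intercept = train_logistic_sgd(points, num_features=1)

low = predict_raw(weights, intercept, [0.0])   # should approach 0.2
high = predict_raw(weights, intercept, [1.0])  # should approach 0.7
print(low, high)
```

Because the raw score is a sigmoid output, it cannot leave the interval from 0 to 1, so this only makes sense for labels scaled into that range. Per Christopher's point, the fit is also limited to a monotone function of a linear combination of the inputs.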