Re: [Spark ML] Positive-Only Training Classification in Scala

2018-01-16 Thread Matt Hicks
If I try to use LogisticRegression with only positive training data, it always gives me positive results: Positive Only private def positiveOnly(): Unit = {val training = spark.createDataFrame(Seq( (1.0, Vectors.dense(0.0, 1.1, 0.1)), (1.0, Vectors.dense(0.0, 1.0,
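
One way to see why every prediction comes out positive: with no negative labels, nothing in the loss pushes any probability below 1. The sketch below is plain Scala, not Spark's implementation, and it simplifies to an intercept-only logistic model; the object name, helper names, and label lists are made up for illustration.

```scala
object PositiveOnlyIntuition {
  // Logistic (sigmoid) function: maps a logit to a probability.
  def sigmoid(z: Double): Double = 1.0 / (1.0 + math.exp(-z))

  // Maximum-likelihood intercept of an intercept-only logistic model:
  // the logit of the observed fraction of positive labels.
  def interceptOnlyLogit(labels: Seq[Double]): Double = {
    val positiveFraction = labels.count(_ == 1.0).toDouble / labels.size
    math.log(positiveFraction / (1.0 - positiveFraction))
  }

  def main(args: Array[String]): Unit = {
    val positiveOnly = Seq(1.0, 1.0, 1.0)   // hypothetical labels, all positive
    val mixed = Seq(1.0, 1.0, 0.0, 0.0)     // hypothetical labels, both classes

    // All-positive labels: the best intercept is +Infinity, so the
    // predicted probability of the positive class is exactly 1.0.
    println(sigmoid(interceptOnlyLogit(positiveOnly)))

    // Balanced labels: the intercept is 0.0 and the probability is 0.5.
    println(sigmoid(interceptOnlyLogit(mixed)))
  }
}
```

The same effect carries over when features are added: without any row labelled 0.0, the fit has no reason to assign a low probability to anything.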

Re: [Spark ML] Positive-Only Training Classification in Scala

2018-01-16 Thread Matt Hicks
Hi Hari, I'm not sure I understand.  I apologize, I'm still pretty new to Spark and Spark ML.  Can you point me to some example code or documentation that would more fully represent this? Thanks On Tue, Jan 16, 2018 2:54 AM, hosur narahari hnr1...@gmail.com wrote: You can make use of

Re: [Spark ML] Positive-Only Training Classification in Scala

2018-01-16 Thread hosur narahari
You can make use of the probability vector from Spark classification. When you run a Spark classification model for prediction, along with assigning a class, Spark also gives a probability vector (the probability that the row belongs to each individual class). So just take the probability
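
A minimal sketch of reading that probability vector in Spark ML follows. It assumes a Spark dependency on the classpath and, unlike the original poster's data, training rows with both labels. Every row except the first positive one is made up, and the names `ProbabilitySketch`, `positiveProbabilities`, and `pDonate` are hypothetical.

```scala
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.linalg.{Vector, Vectors}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, udf}

object ProbabilitySketch {
  // Trains on a tiny two-class set and returns P(label = 1.0) per new contact.
  def positiveProbabilities(spark: SparkSession): Array[Double] = {
    // Assumption: both labels are present; all rows but the first are invented.
    val training = spark.createDataFrame(Seq(
      (1.0, Vectors.dense(0.0, 1.1, 0.1)),
      (1.0, Vectors.dense(0.1, 1.0, 0.2)),
      (0.0, Vectors.dense(2.0, -1.0, 1.5)),
      (0.0, Vectors.dense(1.8, -0.9, 1.2))
    )).toDF("label", "features")

    val model = new LogisticRegression()
      .setMaxIter(10)
      .setRegParam(0.01)
      .fit(training)

    // The "probability" column holds one entry per class;
    // index 1 is the probability of label 1.0.
    val positiveProb = udf((v: Vector) => v(1))

    // Invented contacts to score.
    val newContacts = spark.createDataFrame(Seq(
      Tuple1(Vectors.dense(0.0, 1.2, 0.0)),
      Tuple1(Vectors.dense(1.9, -1.0, 1.3))
    )).toDF("features")

    model.transform(newContacts)
      .withColumn("pDonate", positiveProb(col("probability")))
      .select("pDonate")
      .collect()
      .map(_.getDouble(0))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("probability-sketch")
      .getOrCreate()
    positiveProbabilities(spark).foreach(println)
    spark.stop()
  }
}
```

The resulting values can be sorted or thresholded instead of relying on the hard `prediction` column.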

Re: [Spark ML] Positive-Only Training Classification in Scala

2018-01-15 Thread Georg Heiler
I do not know that module, but in the literature PUL (positive-unlabeled learning) is the exact term you should look for. Matt Hicks wrote on Mon, Jan 15, 2018 at 20:56: > Is it fair to assume this is what I need? > https://github.com/ispras/pu4spark > > On Mon, Jan 15, 2018 1:55 PM, Georg Heiler

Re: [Spark ML] Positive-Only Training Classification in Scala

2018-01-15 Thread Matt Hicks
Is it fair to assume this is what I need? https://github.com/ispras/pu4spark On Mon, Jan 15, 2018 1:55 PM, Georg Heiler georg.kf.hei...@gmail.com wrote: As far as I know, Spark does not implement such algorithms. In case the dataset is small

Re: [Spark ML] Positive-Only Training Classification in Scala

2018-01-15 Thread Georg Heiler
As far as I know, Spark does not implement such algorithms. If the dataset is small, http://scikit-learn.org/stable/modules/generated/sklearn.svm.OneClassSVM.html might be of interest to you. Jörn Franke wrote on Mon, Jan 15, 2018 at 20:04: > I think you look

Re: [Spark ML] Positive-Only Training Classification in Scala

2018-01-15 Thread Jörn Franke
I think you are looking more for unsupervised learning algorithms, e.g. clustering. Depending on the characteristics, different clusters might be created, e.g. donor or non-donor. Most likely you will also find more clusters (e.g. would donate but has a disease preventing it, or is too old). You can verify

[Spark ML] Positive-Only Training Classification in Scala

2018-01-15 Thread Matt Hicks
I'm attempting to train a classifier, but I only have positive examples. Specifically, in this case the data is a list of users who are donors, and I want to use it as training data to classify new contacts and give the probability that they will donate. Any insights or links