Re: K Means Clustering Explanation

2018-03-02 Thread Matt Hicks
ow you can group this dataset by cluster id and aggregate over the original 5 features. E.g., get the mean for numerical data or the value that occurs the most for categorical data. The exact aggregation is use-case dependent. I hope this helps, Christoph On 01.03.2018 21:40, "Matt
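
The group-and-aggregate idea in the reply above can be sketched roughly as follows. This is a minimal plain-Python illustration, not Spark ML code: the rows, the column names (`cluster`, `age`, `plan`) and the helper `summarize_clusters` are all hypothetical, and it assumes each row already carries the cluster id assigned by the fitted model.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical rows: original features plus the cluster id the model assigned.
rows = [
    {"cluster": 0, "age": 20.0, "plan": "basic"},
    {"cluster": 0, "age": 30.0, "plan": "basic"},
    {"cluster": 1, "age": 50.0, "plan": "premium"},
    {"cluster": 1, "age": 60.0, "plan": "premium"},
    {"cluster": 1, "age": 70.0, "plan": "basic"},
]


def summarize_clusters(rows, cluster_key, numeric_cols, categorical_cols):
    """Group rows by cluster id and aggregate each original feature."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[cluster_key]].append(row)

    summary = {}
    for cluster_id, members in groups.items():
        profile = {"size": len(members)}
        # Numerical features: mean per cluster.
        for col in numeric_cols:
            profile[col] = mean(r[col] for r in members)
        # Categorical features: most frequent value per cluster.
        for col in categorical_cols:
            profile[col] = Counter(r[col] for r in members).most_common(1)[0][0]
        summary[cluster_id] = profile
    return summary


summary = summarize_clusters(rows, "cluster", ["age"], ["plan"])
for cluster_id in sorted(summary):
    print(cluster_id, summary[cluster_id])
```

Comparing the per-cluster profiles side by side is one way to describe what sets a cluster apart. As the reply notes, the right aggregation depends on the use case.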

K Means Clustering Explanation

2018-03-01 Thread Matt Hicks
I'm using K Means clustering for a project right now, and it's working very well. However, I'd like to determine from the clusters which feature distinctions define each cluster, so I can explain the "reasons" a data point fits into a specific cluster. Is there a proper way to do this in Spark ML?

Re: [Spark ML] Positive-Only Training Classification in Scala

2018-01-16 Thread Matt Hicks
nfo] ([-1.0,1.5,1.3], 1.0) -> prob=[0.0,1.0], prediction=1.0
[info] ([3.0,2.0,-0.1], 0.0) -> prob=[0.0,1.0], prediction=1.0
[info] ([0.0,2.2,-1.5], 1.0) -> prob=[0.0,1.0], prediction=1.0
On Tue, Jan 16, 2018 8:51 AM, Matt Hicks m...@outr.com wrote: Hi Hari, I'm not sure I und

Re: [Spark ML] Positive-Only Training Classification in Scala

2018-01-16 Thread Matt Hicks
class. And it'll be the same as the probability that a person will become a donor. Best Regards, Hari On 15 Jan 2018 11:51 p.m., "Matt Hicks" <m...@outr.com> wrote: I'm attempting to create a training classification, but only have positive information. Specifically in this case it

Re: [Spark ML] Positive-Only Training Classification in Scala

2018-01-15 Thread Matt Hicks
5. Jan 2018, at 19:21, Matt Hicks <m...@outr.com> wrote: I'm attempting to create a training classification, but only have positive information.  Specifically in this case it is a donor list of users, but I want to use it as training in order to determine classification for new cont

[Spark ML] Positive-Only Training Classification in Scala

2018-01-15 Thread Matt Hicks
are appreciated. I've gone through the documentation but have been unable to find any references to how I might do this. Thanks --- Matt Hicks Chief Technology Officer 405.283.6887 | http://outr.com