Now you can group this dataset by cluster id and aggregate over the original 5
features. E.g., get the mean for numerical data or the value that occurs the
most for categorical data.
The exact aggregation is use-case dependent.
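As a rough illustration of that group-and-aggregate step, here is a minimal sketch in plain Python (not Spark). The column names (`cluster`, `age`, `income`, `region`) and the sample rows are made up for the example, and `summarize_clusters` is a hypothetical helper, not part of any library.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical rows: each record already carries the cluster id assigned by
# the clustering step. All names and values here are invented for illustration.
rows = [
    {"cluster": 0, "age": 25, "income": 40000.0, "region": "north"},
    {"cluster": 0, "age": 31, "income": 42000.0, "region": "north"},
    {"cluster": 0, "age": 28, "income": 41000.0, "region": "south"},
    {"cluster": 1, "age": 52, "income": 90000.0, "region": "west"},
    {"cluster": 1, "age": 48, "income": 86000.0, "region": "west"},
]


def summarize_clusters(rows, numeric_cols, categorical_cols, key="cluster"):
    """Group rows by cluster id, then take the mean of each numeric column
    and the most frequent value of each categorical column."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)

    summary = {}
    for cluster_id, members in groups.items():
        stats = {}
        for col in numeric_cols:
            stats[col] = mean(r[col] for r in members)
        for col in categorical_cols:
            # most_common(1) returns [(value, count)] for the top value
            stats[col] = Counter(r[col] for r in members).most_common(1)[0][0]
        summary[cluster_id] = stats
    return summary


summary = summarize_clusters(
    rows, numeric_cols=["age", "income"], categorical_cols=["region"]
)
for cluster_id, stats in sorted(summary.items()):
    print(cluster_id, stats)
```

In Spark itself the same idea would typically be a `groupBy` on the prediction column of the transformed DataFrame followed by `agg`; which aggregates make sense still depends on the use case.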
I hope this helps,
Christoph
On 01.03.2018 at 21:40, "Matt ..." wrote:
I'm using K Means clustering for a project right now, and it's working very
well. However, I'd like to determine from the clusters which distinguishing
features define each cluster, so I can explain the "reasons" data fits into a
specific cluster.
Is there a proper way to do this in Spark ML?
[info] ([-1.0,1.5,1.3], 1.0) -> prob=[0.0,1.0], prediction=1.0
[info] ([3.0,2.0,-0.1], 0.0) -> prob=[0.0,1.0], prediction=1.0
[info] ([0.0,2.2,-1.5], 1.0) -> prob=[0.0,1.0], prediction=1.0
On Tue, Jan 16, 2018 8:51 AM, Matt Hicks m...@outr.com wrote:
Hi Hari, I'm not sure I und
class. And it'll be the same as the probability that a person will
become a donor.
Best Regards,
Hari
On 15 Jan 2018 11:51 p.m., "Matt Hicks" <m...@outr.com> wrote:
I'm attempting to create a training classification, but only have positive
information. Specifically in this case it
5. Jan 2018, at 19:21, Matt Hicks <m...@outr.com> wrote:
I'm attempting to create a training classification, but only have positive
information. Specifically in this case it is a donor list of users, but I want
to use it as training in order to determine classification for new cont
are appreciated. I've gone through the documentation but
have been unable to find any references to how I might do this.
Thanks
---
Matt Hicks
Chief Technology Officer
405.283.6887 | http://outr.com