Re: [Edit] Approach for Clustering Data

Ted Dunning Mon, 17 Feb 2014 07:26:31 -0800

Think about the question in terms of whether this will define a reasonable
kind of distance between items or users.


Can you first define what you want to do?  Are you clustering users?  Are
you clustering items?

If users, how could the data you provide give any kind of idea about which
users are similar?

If items, where is information about the item?
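
If the goal turns out to be clustering users, here is a minimal sketch of
one way to keep UserID out of the distance calculation and still map it back
to the result. It is plain Python, not Mahout code, and it rests on
assumptions that are not in the original message: the sample rows, the
kmeans() helper, seeding from the first k points, and unscaled Euclidean
distance are all hypothetical choices made for illustration.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid (Euclidean distance).
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids

# Hypothetical monthly summaries: (UserID, Transaction, Quantity, Discount).
rows = [
    ("u1", 2.0, 3.0, 0.0),
    ("u2", 3.0, 4.0, 1.0),
    ("u3", 40.0, 90.0, 15.0),
    ("u4", 42.0, 85.0, 12.0),
]

user_ids = [r[0] for r in rows]          # kept aside as a key, never clustered
features = [list(r[1:]) for r in rows]   # only these define the distance

assignment, centroids = kmeans(features, k=2)

# Row order is preserved, so position i maps each user back to a cluster.
clusters = dict(zip(user_ids, assignment))
```

Because the assignments come back in input order, the ID can be rejoined by
position. Whether raw Transaction, Quantity and Discount values give a
reasonable distance is a separate question; columns on very different scales
would probably need normalizing first.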




On Mon, Feb 17, 2014 at 2:25 AM, Bikash Gupta <bikash.gupt...@gmail.com> wrote:

> Hi,
>
> Just to clarify my question below, here is another example.
>
> Let's say I am clustering each user's monthly summarized data:
>
> UserID, Transaction, Quantity, Discount
>
> Question 1) If I input UserID, Transaction, Quantity, Discount into
> K-means, will the output be accurate, given that ideally UserID
> shouldn't participate in the clustering?
>
> Question 2) If I input Transaction, Quantity, Discount into K-means, how
> do I map UserID back to the clustered output?
>
>
> I would appreciate any help with this basic problem that I am facing in
> data mining.
>
> Regards
> Bikash
>
> On Fri, Feb 14, 2014 at 11:25 PM, Bikash Gupta <bikash.gupt...@gmail.com>
> wrote:
> > I am a newbie to Mahout, working on a data mining clustering use case
> > using K-Means. I need help understanding how to map the original
> > data to the clustered output to gain more insight. Let's say:
> >
> > After data preparation we have a summarized data set with the
> > following attributes
> >
> > Key1,Key2,Dimension1,Dimension2,Measure1,Measure2,Measure3
> >
> > Now I have run the clustering algorithm on the following attributes
> >
> > Measure1,Measure2,Measure3
> >
> > The output of the clustering would be the cluster ID with its
> > data (Measure1,Measure2,Measure3).
> >
> > Question: How can I cluster on specific attributes of the dataset
> > while the clustered output still contains all attributes?
> >
> > Please help me with the right approach.
> >
> > --
> > Regards
> > Bikash Gupta
>
