Hi, 

I'm new to Mahout (and to machine learning) but have done quite a lot of
reading, especially "Mahout in Action".

I'm trying to cluster users based on their profiles. 
By profile I mean attributes such as age, gender, location and a set of
interests.

All the examples I have seen so far were about vectors whose dimensions are
of a similar type (e.g. occurrence of words in text), but in my case each
dimension is of a different "type" and seems to require a different distance
measure.

* Gender has two possible values - what distance measure should I use here? 
* Age has a larger set of possible values - Euclidean distance?
* Location, expressed as latitude and longitude - Euclidean distance, but
between pairs of (latitude, longitude) points?
* Interests, expressed as a subset of a finite set - here the distance would
depend on the number of items shared between two vectors. I assume I can
write a custom distance measure for it.
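
To make the interests case concrete, here is a minimal sketch of what I have
in mind, in plain Java with no Mahout dependency. It assumes interests are
stored as a set of strings and uses the Jaccard distance (1 minus shared
items divided by total distinct items), which is my own choice of
normalisation. The class and method names are hypothetical, and wiring this
into a Mahout distance measure is not shown.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper for illustration only; not part of Mahout.
final class InterestDistance {
    private InterestDistance() {}

    /** Jaccard distance: 1 - |A intersect B| / |A union B|, always in [0, 1]. */
    static double jaccard(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 0.0; // assumption: two empty interest sets count as identical
        }
        Set<String> shared = new HashSet<>(a);
        shared.retainAll(b);            // items present in both sets
        Set<String> union = new HashSet<>(a);
        union.addAll(b);                // all distinct items across both sets
        return 1.0 - (double) shared.size() / union.size();
    }
}
```

With this definition, two users with identical interests are at distance 0
and two users with nothing in common are at distance 1.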

Beyond deciding on the correct distance measures, I'm not sure how to
combine them into a single clustering process.
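
One option I can think of is to scale each per-attribute distance to [0, 1]
and take a weighted average, roughly as sketched below in plain Java (no
Mahout dependency). Everything here is an assumption for illustration: the
class and field names are hypothetical, the weights are all 1, age gaps are
capped at 100 years, location uses great-circle (haversine) distance instead
of plain Euclidean distance, and interests use Jaccard distance. I don't know
whether this is the recommended approach.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch, not Mahout API: one combined distance over mixed attributes.
final class ProfileDistance {
    static final double EARTH_RADIUS_KM = 6371.0;
    static final double MAX_GEO_KM = Math.PI * EARTH_RADIUS_KM; // half the circumference
    static final double MAX_AGE_GAP = 100.0;                    // assumed cap, in years
    // Assumed weights; tuning them changes which attribute dominates.
    static final double W_GENDER = 1.0, W_AGE = 1.0, W_LOCATION = 1.0, W_INTERESTS = 1.0;

    static final class Profile {
        final String gender;
        final double age, latDeg, lonDeg;
        final Set<String> interests;

        Profile(String gender, double age, double latDeg, double lonDeg, String... interests) {
            this.gender = gender;
            this.age = age;
            this.latDeg = latDeg;
            this.lonDeg = lonDeg;
            this.interests = new HashSet<>(Arrays.asList(interests));
        }
    }

    // Two-valued attribute: 0 if equal, 1 otherwise.
    static double gender(Profile p, Profile q) {
        return p.gender.equals(q.gender) ? 0.0 : 1.0;
    }

    // Absolute age gap scaled to [0, 1].
    static double age(Profile p, Profile q) {
        return Math.min(Math.abs(p.age - q.age) / MAX_AGE_GAP, 1.0);
    }

    // Haversine great-circle distance scaled to [0, 1].
    static double location(Profile p, Profile q) {
        double lat1 = Math.toRadians(p.latDeg), lat2 = Math.toRadians(q.latDeg);
        double dLat = lat2 - lat1;
        double dLon = Math.toRadians(q.lonDeg - p.lonDeg);
        double a = Math.pow(Math.sin(dLat / 2), 2)
                 + Math.cos(lat1) * Math.cos(lat2) * Math.pow(Math.sin(dLon / 2), 2);
        double km = 2 * EARTH_RADIUS_KM * Math.asin(Math.min(1.0, Math.sqrt(a)));
        return Math.min(km / MAX_GEO_KM, 1.0);
    }

    // Jaccard distance between interest sets, already in [0, 1].
    static double interests(Profile p, Profile q) {
        Set<String> union = new HashSet<>(p.interests);
        union.addAll(q.interests);
        if (union.isEmpty()) {
            return 0.0; // assumption: two empty sets count as identical
        }
        Set<String> shared = new HashSet<>(p.interests);
        shared.retainAll(q.interests);
        return 1.0 - (double) shared.size() / union.size();
    }

    // Weighted average of the four scaled distances, so the result stays in [0, 1].
    static double distance(Profile p, Profile q) {
        double total = W_GENDER * gender(p, q) + W_AGE * age(p, q)
                     + W_LOCATION * location(p, q) + W_INTERESTS * interests(p, q);
        return total / (W_GENDER + W_AGE + W_LOCATION + W_INTERESTS);
    }
}
```

The idea would be to hand this single combined distance to the clustering
algorithm, so that only one clustering pass is needed.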

As I said, I'm new to this field so any help would be much appreciated. 

Thanks, 
Raviv

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Clustering-user-profiles-tp3654678p3654678.html
Sent from the Mahout User List mailing list archive at Nabble.com.
