Hi All,

I am new to Mahout and would like to understand the following.

I would like to know why Mahout's clustering algorithms need the actual
records (JSON, etc.) to be converted into numerical vectors.

When a record has mixed data types and we convert it into a numerical
vector, we may no longer be able to apply field-wise comparisons, and
maintaining the mapping between the actual record and its vector is
also a problem.
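
To make the concern concrete, here is a minimal sketch in plain Python
(not Mahout code; the records, field names, and helper functions are
made up for illustration). It assumes a one-hot encoding for the
categorical field and keeps the record id next to each vector, which is
how I understand the mapping would have to be maintained:

```python
import math

# Hypothetical records with mixed field types (illustrative only).
records = [
    {"id": "r1", "color": "red", "age": 30},
    {"id": "r2", "color": "blue", "age": 30},
    {"id": "r3", "color": "red", "age": 35},
]

# Assumed fixed list of categories used for the one-hot encoding.
COLORS = ["blue", "red"]


def vectorize(record):
    """Flatten one record into a list of floats: one-hot color, then age."""
    one_hot = [1.0 if record["color"] == c else 0.0 for c in COLORS]
    return one_hot + [float(record["age"])]


def euclidean(a, b):
    """Plain Euclidean distance over the flattened vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Keep the record id alongside its vector so a clustered vector can be
# traced back to the original record.
named_vectors = {r["id"]: vectorize(r) for r in records}

# r1 and r2 differ only in color; r1 and r3 differ only in age.
d_color = euclidean(named_vectors["r1"], named_vectors["r2"])
d_age = euclidean(named_vectors["r1"], named_vectors["r3"])
```

Once the fields are flattened like this, the 5-year age difference
contributes more to the distance than a different color does, and the
distance alone no longer says which field differed.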

Is numerical vectorization only a performance optimization, or is
there another reason?

Would it make sense to apply clustering directly to the actual records?


Thanks & Regards,
B Anil Kumar.
