On Wed, Mar 29, 2017 at 9:10 AM, Dmitriy Lyubimov <dlie...@gmail.com> wrote:
> Sorry, I think more commonly, if the aggregating transpose is to be
> used, then the centroid assignments are better used as the row keys of
> the matrix D (so D := A), and the aggregating transpose is performed on
> the matrix (1 | D)' (i.e., (1 cbind D).t), so that the first row of the
> result contains the counts of cluster points and we can finish up the
> centroid computation via
>
> M = (1 | D)'
> C = M(:, 2:), with each row Hadamard-divided by the counts M(:, 1)
> (implying Golub/Van Loan notation for subblocking).

Argh, the other way around. This should of course read

C = M(2:, :), with each row Hadamard-divided by M(1, :),

in Golub/Van Loan notation. 1 | D means 1 cbind D in Samsara-speak.
Slicing is explained in the manual; note that in Samsara rows start at 0,
not at 1 as in the common notation or in R.

Implied is that M(1, :) should be collected as a simple in-core vector
and then broadcast, with the actual row-wise division being done in
M(2:, :).mapBlock() {...}. Rough sketches of both steps are at the
bottom, after the quoted thread.

> On Wed, Mar 29, 2017 at 9:02 AM, Dmitriy Lyubimov <dlie...@gmail.com>
> wrote:
>
>> The simplest scheme is to initialize a distributed matrix of the shape
>> D := (0 | A), where A is your dataset and 0 is a single column holding
>> the current centroid assignments, and to distribute the current
>> centroid matrix C via a matrix broadcast (assuming there are few
>> enough centers).
>>
>> Then alternately run the cluster assignment within a mapBlock()
>> operator on D and the recomputation of the new centroids C afterwards.
>> The recomputation of the centroids can be done via the aggregating
>> transpose.
>>
>> Of course, a better scheme includes pre-sketching (k-means||) and the
>> use of the triangle inequality during the recomputations.
>>
>> On Wed, Mar 29, 2017 at 8:30 AM, KHATWANI PARTH BHARAT <
>> h2016...@pilani.bits-pilani.ac.in> wrote:
>>
>>> Sir,
>>> I am trying to write the k-means clustering algorithm using Mahout
>>> Samsara, but I am a bit confused about how to leverage the
>>> Distributed Row Matrix for it. Can anybody help me with this?
>>>
>>> Thanks
>>> Parth Khatwani
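P.S. Below are rough, untested sketches of both steps in the Samsara
Scala DSL, written from memory, assuming Int-keyed DRMs and an implicit
DistributedContext in scope; names like drmA (the m x n dataset, called A
above) and inCoreC (the in-core k x n centroid matrix) are just
illustrative. First the assignment step: broadcast C and re-key every
row of A by its nearest centroid inside mapBlock(). Only the row keys
change; the points themselves stay put.

import org.apache.mahout.math._
import org.apache.mahout.math.scalabindings._
import org.apache.mahout.math.scalabindings.RLikeOps._
import org.apache.mahout.math.drm._
import org.apache.mahout.math.drm.RLikeDrmOps._

// inCoreC: in-core k x n matrix of the current centroids, small enough
// to ship to every worker via a broadcast.
val bcastC = drmBroadcast(inCoreC)

// Re-key each row of the dataset by the index of its nearest centroid.
val drmD = drmA.mapBlock() { case (keys, block) =>
  val c = bcastC.value
  val newKeys = new Array[Int](keys.length)
  for (r <- 0 until block.nrow) {
    val row = block(r, ::)
    var best = 0
    var bestDist = Double.PositiveInfinity
    for (i <- 0 until c.nrow) {
      val delta = c(i, ::) - row
      val dist = delta dot delta // squared Euclidean distance
      if (dist < bestDist) { bestDist = dist; best = i }
    }
    newKeys(r) = best
  }
  newKeys -> block
}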
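And the recomputation step via the aggregating transpose, as corrected
above: rows of (1 | D) that share a row key (a cluster index) get summed
by the transpose, which puts the counts into row 0 and the per-cluster
coordinate sums into rows 1..n. I build the 1-column block-wise rather
than relying on a scalar cbind. Caveat: a cluster that ends up empty has
a zero count, so a real implementation needs a guard for that before
dividing.

val n = drmD.ncol

// (1 | D): prepend a column of ones, block-wise.
val drm1D = drmD.mapBlock(ncol = n + 1) { case (keys, block) =>
  val nb = block.like(block.nrow, n + 1)
  for (r <- 0 until block.nrow) {
    nb(r, 0) = 1.0
    nb(r, ::).viewPart(1, n) := block(r, ::)
  }
  keys -> nb
}

// Aggregating transpose: M is (n + 1) x k, with counts in row 0.
val drmM = drm1D.t

// Collect row 0 as a simple in-core vector of counts, then broadcast it.
val counts = drmM(0 until 1, 0 until drmM.ncol).collect(0, ::)
val bcastCounts = drmBroadcast(counts)

// Hadamard-divide every remaining row by the counts row, block-wise.
val drmCt = drmM(1 until n + 1, 0 until drmM.ncol).mapBlock() {
  case (keys, block) =>
    val cnt = bcastCounts.value
    for (r <- 0 until block.nrow) block(r, ::) /= cnt
    keys -> block
}

// Columns of drmCt are the new centroids; collect in-core and transpose
// to get the k x n matrix C for the next broadcast.
val inCoreC = drmCt.collect.t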