Hi all,

Just a couple of questions:

1) The first and second iterations of my algorithm are, for all practical purposes, two independent spectral clustering algorithms, so I'd like to keep both intact. However, since many of their computations are identical, I have been considering different ways of reusing the utilities they share. One method Isabel suggested was to create a superpackage "o.a.m.clustering.spectral" with subpackages for the algorithms and the shared utilities. Another method, and the reason I wanted to pose this question to the list, is to make these utilities global, perhaps in the mahout.math package. The utilities include a map/reduce task for reading an input graph into a DistributedRowMatrix, a task for creating a diagonal DRM by summing the rows of an input DRM, and a task for normalizing the rows of a DRM to unit length. Would these tasks be useful on a global level? Or should I stick to keeping them in my own subpackage?

2) Is anyone familiar with non-maximal suppression, specifically within the context of graphs? I'm having a difficult time wrapping my mind around this step. Given a sensitivity S_ij (which is a function of edge weights within a graph, or in this case, a value within a matrix), it needs to be suppressed if there is a strictly more negative value S_mj or S_in for some vertex v_m in the neighborhood of v_j, or some v_n in the neighborhood of v_i.

To me, this seems to be simply a case of finding all nodes connected to the vertices v_i and v_j: if any of the edges to those other vertices yields an S_mj (if connected to v_j) or an S_in (if connected to v_i) that is less than S_ij, then S_ij is to be "suppressed". From what I can tell, suppression is simply a way of flagging a value; the value itself isn't changed.

tl;dr version: within an immediate neighborhood of nodes, we're looking for the local minimum and flagging everything that isn't the local minimum. Is this accurate?

Thanks!

Regards,
Shannon
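For readers weighing question 1, the two matrix utilities can be sketched in memory as below. This is only an illustration of what the tasks compute, not the map/reduce code: it uses plain double[][] arrays instead of DistributedRowMatrix, the class and method names are hypothetical, and "unit length" is assumed to mean Euclidean (L2) length.

```java
// Hypothetical in-memory stand-ins for two of the shared utilities.
// Plain arrays are used here; none of this is the Mahout API.
final class SpectralUtilsSketch {
    private SpectralUtilsSketch() {}

    /** Returns a diagonal matrix D with D[i][i] = sum of row i of the input. */
    public static double[][] rowSumDiagonal(double[][] a) {
        int n = a.length;
        double[][] d = new double[n][n];
        for (int i = 0; i < n; i++) {
            double sum = 0.0;
            for (double v : a[i]) {
                sum += v;
            }
            d[i][i] = sum;
        }
        return d;
    }

    /**
     * Returns a copy of the input with every row scaled to Euclidean length 1
     * (assumed norm). All-zero rows are copied unchanged to avoid dividing by zero.
     */
    public static double[][] unitRows(double[][] a) {
        double[][] out = new double[a.length][];
        for (int i = 0; i < a.length; i++) {
            double squared = 0.0;
            for (double v : a[i]) {
                squared += v * v;
            }
            double norm = Math.sqrt(squared);
            out[i] = a[i].clone();
            if (norm > 0.0) {
                for (int j = 0; j < out[i].length; j++) {
                    out[i][j] /= norm;
                }
            }
        }
        return out;
    }
}
```

Both methods depend only on a matrix, not on anything specific to clustering, which is the property that matters for the "global or subpackage" question.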

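The suppression rule as read in question 2 can be written down as a short sketch. It encodes only that reading, which is the thing being asked about, so it should not be taken as a confirmed definition. The class name, the dense double[][] sensitivity matrix, and the boolean[][] adjacency matrix are assumptions made for illustration.

```java
// Sketch of the reading in question 2: S_ij is flagged as suppressed when some
// neighbor m of j has S_mj < S_ij, or some neighbor n of i has S_in < S_ij.
// The sensitivities themselves are never modified.
final class NonMaximalSuppressionSketch {
    private NonMaximalSuppressionSketch() {}

    /**
     * @param s   sensitivity matrix, s[i][j] defined for every edge (i, j)
     * @param adj adjacency matrix, adj[i][j] true when i and j share an edge
     * @return flags where flag[i][j] is true if S_ij is suppressed
     */
    public static boolean[][] suppressed(double[][] s, boolean[][] adj) {
        int n = s.length;
        boolean[][] flag = new boolean[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                if (!adj[i][j]) {
                    continue; // only existing edges carry a sensitivity
                }
                for (int k = 0; k < n && !flag[i][j]; k++) {
                    // k acts as v_m (neighbor of v_j) and as v_n (neighbor of v_i)
                    boolean lowerAtJ = adj[k][j] && s[k][j] < s[i][j];
                    boolean lowerAtI = adj[i][k] && s[i][k] < s[i][j];
                    flag[i][j] = lowerAtJ || lowerAtI;
                }
            }
        }
        return flag;
    }
}
```

On a path graph 0-1-2 with S_01 = -1 and S_12 = -3, this flags S_01 (because S_12 is strictly more negative and shares vertex 1) and leaves S_12 unflagged, which matches the "keep the local minimum" summary.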