That would be a very, very good thing (uniform data usage).

On Sat, Apr 17, 2010 at 2:52 PM, Jake Mannix <jake.man...@gmail.com> wrote:

> Currently, FuzzyKMeansClusterMapper has WritableComparable<?>
> keys which are ignored.  Could we instead have the identifier for the
> vector live there, where it makes sense?  Then that same key could
> be the mapper output key, instead of the name of the Vector.
>
> This kind of change could get the clustering code to effectively be
> able to run sensibly on the same SequenceFile<IntWritable,VectorWritable>
> that DistributedRowMatrix is running on, and that would be very nice,
> I think.
>
