Hi Benson,
We are in the same boat on old dogs and new tricks. Because of the
potentially large volume of points that can be clustered by the MR
implementation, storing the points with the canopies won't scale. Thus,
the CanopyClusteringJob does two passes: the CanopyDriver.runJob pass
proces
Should ref implementations be in the examples project or in the core, or in
some third location?
On Sun, May 31, 2009 at 12:21 PM, Ted Dunning wrote:
> Yes.
>
> On Sun, May 31, 2009 at 7:09 AM, Benson Margulies wrote:
>
> > Should the ref implementation be in a class by itself?
Jeff,
I'm an old dog who has been taught a certain number of machine learning new
tricks. There's a common thread to the Canopy and KMeans code that has me
doing a certain amount of head-scratching.
The Canopy class doesn't keep a reference to the points in the canopy. But
someone must. Or is can
Yes.
On Sun, May 31, 2009 at 7:09 AM, Benson Margulies wrote:
> Should the ref implementation be in a class by itself?
--
Ted Dunning, CTO
DeepDyve
Question:
What's the role of a reference implementation embedded in a GUI?
I think I can patch up the implementation in DisplayKMeans easily enough.
Should the ref implementation be in a class by itself?
--be
On Sat, May 30, 2009 at 8:22 PM, Jeff Eastman wrote:
I think you are actually correct about the reference implementation that
is used in the tests and that example. I was looking at the
Canopy.addPointToCanopies() method which does add a new canopy if there
are none that are strongly bound (suggest a fix?
Jeff
Benson Margulies wrote:
I'll look at the copy in DisplayKMeans again and see if it is missing that
last test.
On Sat, May 30, 2009 at 12:41 PM, Jeff Eastman wrote:
Canopy tests each point against the current set of canopies, adding the
point to each canopy that is within t1 and finally stopping when it
finds one within t2. If all canopies are tested and none are within t2
then a new canopy is added with the point as its center. So, even if you
set t1 and
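For illustration, the point-by-point procedure described in this message (bind to every canopy within t1, stop at the first canopy within t2, otherwise start a new canopy) can be sketched in plain Java as below. This is a minimal in-memory sketch, not Mahout's actual Canopy class: the names CanopySketch, Canopy, distance and cluster are hypothetical, Euclidean distance is an assumption, and so is the choice to count a new canopy's center as its first member. Only the method name addPointToCanopies comes from the thread, and its signature here is invented.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the reference (non-MapReduce) canopy pass.
class CanopySketch {

    // A canopy with a fixed center. Keeping the member points in memory is
    // for illustration only; it would not scale to large inputs.
    static final class Canopy {
        final double[] center;
        final List<double[]> points = new ArrayList<>();

        Canopy(double[] center) {
            this.center = center;
        }
    }

    // Euclidean distance (an assumption; any distance measure would do).
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Test the point against the current canopies in order: add it to each
    // canopy within t1, and stop at the first canopy within t2. If none is
    // within t2, the point becomes the center of a new canopy.
    static void addPointToCanopies(double[] point, List<Canopy> canopies,
                                   double t1, double t2) {
        boolean stronglyBound = false;
        for (Canopy canopy : canopies) {
            double d = distance(point, canopy.center);
            if (d < t1) {
                canopy.points.add(point);
            }
            if (d < t2) {
                stronglyBound = true;
                break;
            }
        }
        if (!stronglyBound) {
            Canopy created = new Canopy(point);
            created.points.add(point); // assumption: the center counts as a member
            canopies.add(created);
        }
    }

    // Run the pass over all points, in input order.
    static List<Canopy> cluster(List<double[]> points, double t1, double t2) {
        List<Canopy> canopies = new ArrayList<>();
        for (double[] p : points) {
            addPointToCanopies(p, canopies, t1, t2);
        }
        return canopies;
    }
}
```

With t1 = 3.0 and t2 = 1.5, the points (0,0), (1,0), (2.5,0), (10,0) yield three canopies. The point (2.5,0) is within t1 of the first canopy, so it is added there, but it is not within t2 of any canopy, so it also becomes the center of a second one.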