Re: Canopy Clustering not scaling

Robin Anil Sun, 02 May 2010 05:53:38 -0700

I don't think you got the algorithm correct. The canopy list is empty at
the start and is automatically populated using the distance threshold. This
may work, but I don't have a clue how to get to this point.
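
To illustrate the point about the canopy list, here is a minimal Python sketch, not Mahout's implementation. It assumes a single distance threshold and Euclidean distance (the usual formulation of canopy clustering uses two thresholds, T1 > T2); the names `build_canopies` and `threshold` are hypothetical. The list starts empty, and a point becomes a new canopy center only when no existing center is within the threshold.

```python
import math


def build_canopies(points, threshold):
    """Single-pass canopy generation with one threshold (an assumption)."""
    canopies = []  # starts empty; each entry is (center, members)
    for p in points:
        covered = False
        for center, members in canopies:
            # A point joins every canopy whose center is close enough.
            if math.dist(p, center) <= threshold:
                members.append(p)
                covered = True
        if not covered:
            # No existing canopy covers p, so p seeds a new canopy.
            canopies.append((p, [p]))
    return canopies


sample = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0)]
canopies = build_canopies(sample, threshold=2.0)
```

The set of centers is not known up front here: it is a by-product of scanning the points, so a job that takes the canopies as input needs an earlier step that produces them.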


On Sun, May 2, 2010 at 6:15 PM, Sean Owen <sro...@gmail.com> wrote:

> How about this for the first phase? I think you can imagine how the
> rest goes, more later...
>
>
> Mapper 1A.
> map() input: One canopy
> map() output: canopy ID -> canopy
>
> Mapper 1B.
> (Has in memory all canopy IDs, read at startup)
> map() input: one point
> map() output: for each canopy ID, canopy ID -> point
>
> Reducer 1.
> reduce() input: canopy ID mapped to many points, one canopy
> reduce() output: for each point, compute distance from point to
> canopy, output (canopy ID, point ID) -> distance
>
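
The quoted first phase can be simulated in memory to see what it emits. The following is a rough Python sketch under stated assumptions, not Hadoop or Mahout code: the canopies are taken as already known, each canopy is represented by its center only, the distance is Euclidean, and all function names and the `"canopy"`/`"point"` tags are hypothetical. A dictionary stands in for the shuffle between the map and reduce steps.

```python
import math
from collections import defaultdict


def map_canopy(canopy_id, center):
    # Mapper 1A: canopy ID -> canopy
    yield canopy_id, ("canopy", center)


def map_point(point_id, point, canopy_ids):
    # Mapper 1B: for each known canopy ID, canopy ID -> point
    for canopy_id in canopy_ids:
        yield canopy_id, ("point", (point_id, point))


def reduce_distances(canopy_id, values):
    # Reducer 1: one canopy plus many points -> (canopy ID, point ID) -> distance
    center = next(v for tag, v in values if tag == "canopy")
    for tag, v in values:
        if tag == "point":
            point_id, point = v
            yield (canopy_id, point_id), math.dist(center, point)


def run_phase_one(canopies, points):
    grouped = defaultdict(list)  # stands in for the shuffle/sort step
    for canopy_id, center in canopies.items():
        for key, value in map_canopy(canopy_id, center):
            grouped[key].append(value)
    canopy_ids = list(canopies)  # "read at startup" by mapper 1B
    for point_id, point in points.items():
        for key, value in map_point(point_id, point, canopy_ids):
            grouped[key].append(value)
    distances = {}
    for canopy_id, values in grouped.items():
        distances.update(reduce_distances(canopy_id, values))
    return distances


distances = run_phase_one(
    {"c1": (0.0, 0.0), "c2": (3.0, 4.0)},
    {"p1": (3.0, 4.0), "p2": (0.0, 0.0)},
)
```

Mapper 1B emits every point once per canopy, so the intermediate output grows with (number of points) x (number of canopies).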
