Yeah, the Canopy issue is sorted out. I was thinking of adding a flag to add a
point to a single canopy instead of adding it to all canopies. This would help
a lot on large datasets. There is no point in adding it to all canopies; you
will get approximate clustering anyway.
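
A minimal sketch of how such a flag might behave, not the actual Mahout code.
The class name, the `singleCanopy` parameter, the `t1` distance threshold and
the choice of the nearest canopy as the single target are all assumptions made
for illustration; the proposal above does not say which canopy would be picked.

```java
import java.util.ArrayList;
import java.util.List;

class CanopyAssignSketch {

  // Euclidean distance between two points of equal dimension.
  static double distance(double[] a, double[] b) {
    double sum = 0.0;
    for (int i = 0; i < a.length; i++) {
      double d = a[i] - b[i];
      sum += d * d;
    }
    return Math.sqrt(sum);
  }

  // Returns the indices of the canopies the point would be added to.
  // singleCanopy == false: every canopy whose center is closer than t1.
  // singleCanopy == true:  only the nearest of those (hypothetical flag).
  static List<Integer> assign(double[] point, List<double[]> centers,
                              double t1, boolean singleCanopy) {
    List<Integer> hits = new ArrayList<>();
    int nearest = -1;
    double best = Double.MAX_VALUE;
    for (int i = 0; i < centers.size(); i++) {
      double d = distance(point, centers.get(i));
      if (d < t1) {
        hits.add(i);
        if (d < best) {
          best = d;
          nearest = i;
        }
      }
    }
    if (singleCanopy && nearest >= 0) {
      hits.clear();
      hits.add(nearest);
    }
    return hits;
  }

  public static void main(String[] args) {
    List<double[]> centers = new ArrayList<>();
    centers.add(new double[] {0.0, 0.0});
    centers.add(new double[] {1.0, 0.0});
    centers.add(new double[] {10.0, 0.0});
    double[] p = {0.8, 0.0};
    System.out.println(assign(p, centers, 3.0, false)); // [0, 1]
    System.out.println(assign(p, centers, 3.0, true));  // [1]
  }
}
```

With the flag set, each point is written out once rather than once per
overlapping canopy, which is where the saving on large datasets would come from.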

I have cleaned up most of SoftCluster, but the error still exists. It seems to
be looping forever now. I will post a patch on the issue; please take a
look.

Robin

On Wed, Feb 17, 2010 at 3:35 PM, Jeff Eastman <j...@windwardsolutions.com> wrote:

> Robin Anil wrote:
>
>>> Hadoop reuses the *same* instance whenever it uses readFields and I've
>>> been bitten more than once by assuming otherwise.
>>>
>>>
>>
>> Yep! That's our bug. Always assume mutability in Hadoop :). I will see
>> where the writable is causing the error.
>> Best is if we could have some test data and make a check to see if the
>> algorithm is working.
>>
>>
>>
> Good hunting. I notice that some of the code in the fuzzy MR unit test has
> been commented out but have not looked into it further.
>
> I assume also you have sorted out the canopy issue you were having?
>
> Jeff
>
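
The instance reuse mentioned in the quoted text can be illustrated with a
small self-contained sketch. It does not use Hadoop itself: `PointValue` is a
hypothetical stand-in for a Writable value, and `collect` only imitates a
framework that deserializes every record into one shared object through
`readFields`. The point is that storing the reference instead of a copy leaves
every stored entry showing the last record read.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class ReuseSketch {

  // Hypothetical stand-in for a Writable value; not a Hadoop class.
  static class PointValue {
    double x;

    // Overwrites this object's state in place, as readFields does.
    void readFields(DataInput in) throws IOException {
      x = in.readDouble();
    }

    PointValue copy() {
      PointValue c = new PointValue();
      c.x = x;
      return c;
    }
  }

  // Writes the given doubles back to back into a byte array.
  static byte[] serialize(double... values) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      DataOutputStream out = new DataOutputStream(bytes);
      for (double v : values) {
        out.writeDouble(v);
      }
      out.flush();
      return bytes.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Reads n records into ONE shared instance. With copy == false the
  // returned list holds n references to that same instance.
  static List<PointValue> collect(byte[] data, int n, boolean copy) {
    try {
      DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
      PointValue reused = new PointValue();
      List<PointValue> kept = new ArrayList<>();
      for (int i = 0; i < n; i++) {
        reused.readFields(in);
        kept.add(copy ? reused.copy() : reused);
      }
      return kept;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    byte[] data = serialize(1.0, 2.0, 3.0);
    System.out.println(collect(data, 3, false).get(0).x); // 3.0 (overwritten)
    System.out.println(collect(data, 3, true).get(0).x);  // 1.0 (copied)
  }
}
```

Copying the value before keeping it is the usual defence whenever a value has
to outlive the current iteration.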
