Re: CIMapper Question

Jeff Eastman Sun, 12 Feb 2012 08:35:56 -0800

Thanks Sean & Ted. That is what I've observed experimentally. I was going to pursue a ClusterWriteable along the lines of VectorWritable but will try PolymorphicWritable<Cluster> first. Looking at it, I see it does send the class name, which might be onerous as Sean observed, except for the fact that I am only sending (k) clusters between each mapper and the reducer. I will report on this in an hour or so.


On 2/12/12 9:01 AM, Ted Dunning wrote:

But this sounds like a runtime problem, not a type checking problem.


Polymorphism is generally a problem in the Hadoop API. That is why we
have VectorWritable and why I added PolymorphicWritable.

Jeff,

Two questions:

1) would PolymorphicWritable<Cluster> help?

2) can you say more about what the IOException is?  Does it give any hints?
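
To make the wrapper idea concrete, here is a minimal sketch of how a polymorphic wrapper can work: it writes the concrete class name ahead of the payload, so the framework only ever sees one wrapper class. This is not the Mahout PolymorphicWritable source. SimpleWritable, CenterCluster, Wrapper and roundTrip are hypothetical names invented for illustration, and plain java.io streams stand in for Hadoop serialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class PolymorphicSketch {
  // Hypothetical stand-in for a Writable-style contract.
  interface SimpleWritable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
  }

  // Hypothetical concrete payload; a real cluster would carry vectors.
  static class CenterCluster implements SimpleWritable {
    int id;
    double center;
    CenterCluster() {}  // no-arg constructor is needed for reflection
    CenterCluster(int id, double center) { this.id = id; this.center = center; }
    public void write(DataOutput out) throws IOException {
      out.writeInt(id);
      out.writeDouble(center);
    }
    public void readFields(DataInput in) throws IOException {
      id = in.readInt();
      center = in.readDouble();
    }
  }

  // The single class the framework would see as the value type.
  static class Wrapper implements SimpleWritable {
    SimpleWritable value;
    Wrapper() {}
    Wrapper(SimpleWritable value) { this.value = value; }
    public void write(DataOutput out) throws IOException {
      out.writeUTF(value.getClass().getName());  // class name first
      value.write(out);                          // then the payload
    }
    public void readFields(DataInput in) throws IOException {
      String className = in.readUTF();
      try {
        value = (SimpleWritable) Class.forName(className)
            .getDeclaredConstructor().newInstance();
      } catch (ReflectiveOperationException e) {
        throw new IOException("Cannot instantiate " + className, e);
      }
      value.readFields(in);
    }
  }

  // Serializes through the wrapper and reads the value back.
  static SimpleWritable roundTrip(SimpleWritable v) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      new Wrapper(v).write(new DataOutputStream(bytes));
      Wrapper copy = new Wrapper();
      copy.readFields(new DataInputStream(
          new ByteArrayInputStream(bytes.toByteArray())));
      return copy.value;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    SimpleWritable back = roundTrip(new CenterCluster(3, 1.5));
    System.out.println(back.getClass().getName());
  }
}
```

The cost is one class-name string per record, which matters for large data sets but less so when only a handful of values cross the wire.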

On Sun, Feb 12, 2012 at 7:00 AM, Paritosh Ranjan <[email protected]> wrote:

Can something like this help?

public class CIMapper<T extends Cluster> extends
    Mapper<WritableComparable<?>, VectorWritable, IntWritable, T> {
  ...
}

On 12-02-2012 06:48, Jeff Eastman wrote:

I'm wondering how to tease the elephant into accepting any concrete
instance of the interface o.a.m.clustering.Cluster when writing trained
clusters in the cleanup() method of CIMapper. I've gotten the MR version of
the ClusterIterator to get to that point in testing but it blows chunks
with an IOException when I try to pass an o.a.m.clustering.kmeans.Cluster
(I will rename the latter for 0.7). Seems the MapTask.collect() wants ==
and not instanceof.

I've talked with Ted about passing Clusters rather than the current
ClusterObservations but don't see how at this point. Any ideas?
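
For illustration, the "== and not instanceof" behaviour can be reproduced without Hadoop. The sketch below is a simplified imitation of an exact-class check, not the MapTask source. Cluster and KMeansCluster here are hypothetical stand-ins, and the exception message is an assumption about what such a check might report.

```java
import java.io.IOException;

class ExactClassCheckDemo {
  // Hypothetical stand-ins for the interface and one implementation.
  interface Cluster {}
  static class KMeansCluster implements Cluster {}

  // Rejects any value whose runtime class is not exactly the declared one.
  static void collect(Object value, Class<?> declared) throws IOException {
    if (value.getClass() != declared) {
      throw new IOException("Type mismatch in value from map: expected "
          + declared.getName() + ", received " + value.getClass().getName());
    }
  }

  // Convenience helper: true if collect() accepts the value.
  static boolean accepted(Object value, Class<?> declared) {
    try {
      collect(value, declared);
      return true;
    } catch (IOException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    Cluster c = new KMeansCluster();
    System.out.println(Cluster.class.isInstance(c));         // true
    System.out.println(accepted(c, Cluster.class));          // false
    System.out.println(accepted(c, KMeansCluster.class));    // true
  }
}
```

Declaring the interface as the output value class therefore fails for every concrete instance, since no object's runtime class is the interface itself.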
