Another option is TupleWritable, but pull the source and make sure it works first; I had problems with it.
On Sun, Feb 12, 2012 at 9:22 AM, Jeff Eastman <j...@windwardsolutions.com> wrote:
> This approach worked out, not exactly as below, but I was able to create a
> ClusterWritable which used PolymorphicWritable to read and write its Cluster
> value field. This makes it through the mapper and reducer but I'm still
> working on getting it all to fly in the ClusterIterator.
>
>
> On 2/12/12 9:43 AM, Raphael Cendrillon wrote:
>>
>> Hi Jeff,
>>
>> It's great to see some discussion on this. I ran into a similar problem
>> when trying to make the SplitInput job work for any arbitrary key and value
>> classes. In the end I was able to sidestep the issue by just reading the
>> key and value classes from the SequenceFileInput, but I never found a way to
>> deal with this head on.
>>
>> On 12 Feb, 2012, at 8:35 AM, Jeff Eastman wrote:
>>
>>> Thanks Sean & Ted. That is what I've observed experimentally. I was going
>>> to pursue a ClusterWritable along the lines of VectorWritable but will try
>>> PolymorphicWritable<Cluster> first. Looking at it, I see it does send the
>>> class name, which might be onerous as Sean observed except for the fact that
>>> I am only sending (k) clusters between each mapper and the reducer. I will
>>> report on this in an hour or so.
>>>
>>>
>>> On 2/12/12 9:01 AM, Ted Dunning wrote:
>>>>
>>>> But this sounds like a runtime problem, not a type checking problem.
>>>>
>>>> Polymorphism is generally a problem in the Hadoop API. That is why we
>>>> have VectorWritable and why I added PolymorphicWritable.
>>>>
>>>> Jeff,
>>>>
>>>> Two questions:
>>>>
>>>> 1) would PolymorphicWritable<Cluster> help?
>>>>
>>>> 2) can you say more about what the IOException is? Does it give any
>>>> hints?
>>>>
>>>> On Sun, Feb 12, 2012 at 7:00 AM, Paritosh Ranjan <pran...@xebia.com>
>>>> wrote:
>>>>
>>>>> Can something like this help?
>>>>>
>>>>> public class CIMapper<T extends Cluster> extends
>>>>> Mapper<WritableComparable<?>,VectorWritable,IntWritable,T> {
>>>>> ...
>>>>> }
>>>>>
>>>>> On 12-02-2012 06:48, Jeff Eastman wrote:
>>>>>
>>>>>> I'm wondering how to tease the elephant into accepting any concrete
>>>>>> instance of the interface o.a.m.clustering.Cluster when writing
>>>>>> trained clusters in the cleanup() method of CIMapper. I've gotten the
>>>>>> MR version of the ClusterIterator to get to that point in testing but
>>>>>> it blows chunks with an IOException when I try to pass a
>>>>>> o.a.m.clustering.kmeans.Cluster (I will rename the latter for 0.7).
>>>>>> Seems the MapTask.collect() wants == and not instanceof.
>>>>>>
>>>>>> I've talked with Ted about passing Clusters rather than the current
>>>>>> ClusterObservations but don't see how at this point. Any ideas?
>>>>>>
>>
>>
>

--
Lance Norskog
goks...@gmail.com
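
For anyone reading this thread later, the wrapper idea discussed above (hand the framework one fixed concrete class, and have that class write the payload's class name before the payload itself) can be sketched without Hadoop or Mahout on the classpath. This is only an illustration, not the actual PolymorphicWritable or ClusterWritable code. Every name in it is hypothetical: SimpleWritable is a local stand-in for Hadoop's Writable, Cluster is a stand-in for o.a.m.clustering.Cluster, and SimpleCluster and ClusterHolder are made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Demo entry point (hypothetical): round-trips a concrete cluster through the holder.
class ClusterHolderDemo {
  public static void main(String[] args) {
    Cluster copy = roundTrip(new SimpleCluster(7, 2.5));
    System.out.println(copy.getClass().getSimpleName() + " id=" + copy.getId());
  }

  // Serializes the cluster via the holder into a byte array, then reads it back.
  static Cluster roundTrip(Cluster cluster) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      new ClusterHolder(cluster).write(new DataOutputStream(bytes));
      ClusterHolder restored = new ClusterHolder();
      restored.readFields(
          new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
      return restored.get();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}

// Assumption: local stand-in with the same two methods as Hadoop's Writable.
interface SimpleWritable {
  void write(DataOutput out) throws IOException;
  void readFields(DataInput in) throws IOException;
}

// Assumption: minimal stand-in for the Cluster interface from the thread.
interface Cluster extends SimpleWritable {
  int getId();
}

// One hypothetical concrete cluster type.
class SimpleCluster implements Cluster {
  private int id;
  private double radius;

  SimpleCluster() {} // no-arg constructor so the holder can instantiate it reflectively

  SimpleCluster(int id, double radius) {
    this.id = id;
    this.radius = radius;
  }

  public int getId() { return id; }

  double getRadius() { return radius; }

  public void write(DataOutput out) throws IOException {
    out.writeInt(id);
    out.writeDouble(radius);
  }

  public void readFields(DataInput in) throws IOException {
    id = in.readInt();
    radius = in.readDouble();
  }
}

// The holder is the single concrete type the framework would see;
// its payload can be any Cluster implementation.
class ClusterHolder implements SimpleWritable {
  private Cluster value;

  ClusterHolder() {}

  ClusterHolder(Cluster value) { this.value = value; }

  Cluster get() { return value; }

  public void write(DataOutput out) throws IOException {
    out.writeUTF(value.getClass().getName()); // class name first
    value.write(out);                         // then the payload's own fields
  }

  public void readFields(DataInput in) throws IOException {
    String className = in.readUTF();
    try {
      value = Class.forName(className)
          .asSubclass(Cluster.class)
          .getDeclaredConstructor()
          .newInstance();
    } catch (ReflectiveOperationException e) {
      throw new IOException("Cannot instantiate " + className, e);
    }
    value.readFields(in);
  }
}
```

Because the declared output value class would always be the holder, an exact-class check like the one Jeff describes in MapTask.collect() is satisfied no matter which Cluster implementation is inside. The cost is the class name written with every record, which, as noted in the thread, is small when only (k) clusters travel between each mapper and the reducer.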