Another option is TupleWritable, but pull the source and make sure it
works first; I had problems with it.
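
For comparison, here is a rough sketch of the wrapper idea Jeff describes
below: one concrete wrapper class that writes the class name of the wrapped
cluster and then lets the cluster serialize itself. This is not Mahout's
ClusterWritable or PolymorphicWritable. SimpleWritable, Cluster,
KMeansCluster and ClusterWrapper are hypothetical stand-ins so that the
example compiles without Hadoop on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class PolymorphicWrapperSketch {

    /** Stand-in for Hadoop's Writable (assumption: same two-method shape). */
    interface SimpleWritable {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    /** Hypothetical stand-in for the Cluster interface. */
    interface Cluster extends SimpleWritable {
        int getId();
    }

    /** Hypothetical concrete cluster; the no-arg constructor is needed for reflection. */
    public static class KMeansCluster implements Cluster {
        private int id;
        public KMeansCluster() {}
        public KMeansCluster(int id) { this.id = id; }
        public int getId() { return id; }
        public void write(DataOutput out) throws IOException { out.writeInt(id); }
        public void readFields(DataInput in) throws IOException { id = in.readInt(); }
    }

    /** Single concrete type on the wire, whatever Cluster implementation it carries. */
    public static class ClusterWrapper implements SimpleWritable {
        private Cluster value;
        public ClusterWrapper() {}
        public ClusterWrapper(Cluster value) { this.value = value; }
        public Cluster getValue() { return value; }

        public void write(DataOutput out) throws IOException {
            // Class name first, so the reader knows which implementation to create.
            out.writeUTF(value.getClass().getName());
            value.write(out);
        }

        public void readFields(DataInput in) throws IOException {
            String className = in.readUTF();
            try {
                value = Class.forName(className)
                        .asSubclass(Cluster.class)
                        .getDeclaredConstructor()
                        .newInstance();
            } catch (ReflectiveOperationException e) {
                throw new IOException("Cannot instantiate " + className, e);
            }
            value.readFields(in);
        }
    }

    /** Serializes a cluster through the wrapper and reads it back. */
    static Cluster roundTrip(Cluster cluster) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            new ClusterWrapper(cluster).write(new DataOutputStream(bytes));
            ClusterWrapper copy = new ClusterWrapper();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return copy.getValue();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Cluster copy = roundTrip(new KMeansCluster(7));
        System.out.println(copy.getClass().getSimpleName() + " id=" + copy.getId());
    }
}
```

The cost is the class name string on every record, which is the overhead
discussed below; with only k clusters per mapper it should be small.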

On Sun, Feb 12, 2012 at 9:22 AM, Jeff Eastman
<j...@windwardsolutions.com> wrote:
> This approach worked out, not exactly as below, but I was able to create a
> ClusterWritable which used PolymorphicWritable to read and write its Cluster
> value field. This makes it through the mapper and reducer but I'm still
> working on getting it all to fly in the ClusterIterator.
>
>
> On 2/12/12 9:43 AM, Raphael Cendrillon wrote:
>>
>> Hi Jeff,
>>
>> It's great to see some discussion on this. I ran into a similar problem
>> when trying to make the SplitInput job work for any arbitrary key and value
>> classes. In the end I was able to sidestep the issue by just reading the
>> key and value classes from the SequenceFileInput, but I never found a way to
>> deal with this head on.
>>
>> On 12 Feb, 2012, at 8:35 AM, Jeff Eastman wrote:
>>
>>> Thanks Sean & Ted. That is what I've observed experimentally. I was going
>>> to pursue a ClusterWritable along the lines of VectorWritable but will try
>>> PolymorphicWritable<Cluster> first. Looking at it, I see it does send the
>>> class name, which might be onerous as Sean observed, except for the fact that
>>> I am only sending (k) clusters between each mapper and the reducer. I will
>>> report on this in an hour or so.
>>>
>>>
>>> On 2/12/12 9:01 AM, Ted Dunning wrote:
>>>>
>>>> But this sounds like a runtime problem, not a type checking problem.
>>>>
>>>> Polymorphism is generally a problem in the Hadoop API. That is why we
>>>> have VectorWritable and why I added PolymorphicWritable.
>>>>
>>>> Jeff,
>>>>
>>>> Two questions:
>>>>
>>>> 1) would PolymorphicWritable<Cluster> help?
>>>>
>>>> 2) can you say more about what the IOException is?  Does it give any
>>>> hints?
>>>>
>>>> On Sun, Feb 12, 2012 at 7:00 AM, Paritosh Ranjan <pran...@xebia.com>
>>>> wrote:
>>>>
>>>>> Can something like this help?
>>>>>
>>>>> public class CIMapper<T extends Cluster> extends
>>>>> Mapper<WritableComparable<?>, VectorWritable, IntWritable, T> {
>>>>> ...
>>>>> }
>>>>>
>>>>> On 12-02-2012 06:48, Jeff Eastman wrote:
>>>>>
>>>>>> I'm wondering how to tease the elephant into accepting any concrete
>>>>>> instance of the interface o.a.m.clustering.Cluster when writing trained
>>>>>> clusters in the cleanup() method of CIMapper. I've gotten the MR version
>>>>>> of the ClusterIterator to get to that point in testing but it blows
>>>>>> chunks with an IOException when I try to pass a
>>>>>> o.a.m.clustering.kmeans.Cluster (I will rename the latter for 0.7).
>>>>>> Seems the MapTask.collect() wants == and not instanceof.
>>>>>>
>>>>>> I've talked with Ted about passing Clusters rather than the current
>>>>>> ClusterObservations but don't see how at this point. Any ideas?
>>>>>>
>>>>>>
>>>>>>
>>
>>
>
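
To illustrate the == versus instanceof point in Jeff's original question,
here is a minimal stand-alone sketch. It does not call Hadoop; accepts() is a
hypothetical helper that only mimics an exact-class comparison of the kind
Jeff describes, and Cluster and KMeansCluster are made-up stand-ins.

```java
public class ExactClassCheck {

    /** Hypothetical stand-in for the Cluster interface. */
    interface Cluster {}

    /** Hypothetical concrete implementation. */
    static class KMeansCluster implements Cluster {}

    /**
     * Mimics an exact-class check: the runtime class of the value must be
     * identical to the declared class, so implementations of an interface
     * and subclasses are rejected.
     */
    static boolean accepts(Class<?> declaredValueClass, Object value) {
        return value.getClass() == declaredValueClass;
    }

    public static void main(String[] args) {
        Object value = new KMeansCluster();

        // instanceof is satisfied by any implementation of the interface.
        System.out.println("instanceof Cluster: " + (value instanceof Cluster));

        // An exact-class comparison against the interface is not.
        System.out.println("declared as Cluster: " + accepts(Cluster.class, value));

        // Declaring the concrete class passes, but then only that one
        // implementation can be emitted.
        System.out.println("declared as KMeansCluster: " + accepts(KMeansCluster.class, value));
    }
}
```

This is why declaring the interface as the map output value class is not
enough on its own, and why a single concrete wrapper type helps.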



-- 
Lance Norskog
goks...@gmail.com
