Re: A question about Mapper

2008-10-04 Thread Zhou, Yunqing
Thanks a lot for such a detailed explanation. But I think the reducer here is
unnecessary, so I set the number of reducers to 0.
I'd like to handle everything in the mappers, which is how I ran into the
problem.
Thanks anyway.

On Sat, Oct 4, 2008 at 3:33 PM, Joman Chu <[EMAIL PROTECTED]> wrote:

> Hello,
>
> I assume you want to associate {a,b}, {c,d,e}, and {f} into sets.
>
> One way to do this is by associating some value with each flag and then
> emitting the data associated with that value. For example,
>
> flag
> a
> b
> flag
> c
> d
> e
> flag
> f
>
> I define flag,a,b,c,d,e,f to be the key while in the Mapper context.
>
> Whenever the mapper sees a key, it will emit <UID, Key>. UID is some unique
> identifier associated with a certain set, and Key is the key that was passed
> into the mapper. We are essentially inverting the association here.
>
> Let's step through this test case.
>  1. Choose UID = mapper1flag1.
>  2. <flag> -> Mapper -> <mapper1flag1, flag>
>  3. We have reached a flag, so we change the UID = mapper1flag2.
>  4. <a> -> Mapper -> <mapper1flag2, a>
>  5. <b> -> Mapper -> <mapper1flag2, b>
>  6. <flag> -> Mapper -> <mapper1flag2, flag>
>  7. We have reached a flag, so we change the UID = mapper1flag3.
>  8. <c> -> Mapper -> <mapper1flag3, c>
>  9. <d> -> Mapper -> <mapper1flag3, d>
> 10. <e> -> Mapper -> <mapper1flag3, e>
> 11. <flag> -> Mapper -> <mapper1flag3, flag>
> 12. We have reached a flag, so we change the UID = mapper1flag4.
> 13. <f> -> Mapper -> <mapper1flag4, f>
> 14. EOF
>
> Then the reducers will collect all values with the same UID, so here is
> what we get:
>
> 1. <mapper1flag1, {flag}> -> Reducer -> <{}, null>
> 2. <mapper1flag2, {a,b,flag}> -> Reducer -> <{a,b}, null>
> 3. <mapper1flag3, {c,d,e,flag}> -> Reducer -> <{c,d,e}, null>
> 4. <mapper1flag4, {f}> -> Reducer -> <{f}, null>
>
> Hopefully this solves your problem.
>
> On Sat, October 4, 2008 2:48 am, Zhou, Yunqing said:
> > But the close() function doesn't give me a Collector to emit pairs to.
> >
> > Is it reasonable for me to store a reference to the collector in advance?
> >
> > I'm not sure whether the collector is still available at that point.
> >
> > On Sat, Oct 4, 2008 at 12:17 PM, Joman Chu <[EMAIL PROTECTED]> wrote:
> >
> >
> >> Hello,
> >>
> >> Does MapReduceBase.close() fit your needs? Take a look at
> >> http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/mapred/MapReduceBase.html#close()
> >>
> >> On Fri, October 3, 2008 11:36 pm, Zhou, Yunqing said:
> >>> The input is as follows: flag a b flag c d e flag f
> >>>
> >>> I used a mapper that first stores values and then emits them all
> >>> when it meets a line containing "flag". But when the file reaches its
> >>> end, I have no chance to emit the last record (in this case, f). So
> >>> how can I detect the end of the mapper's life, or how can I emit a
> >>> last record before the mapper exits?
> >>>
> >>> Thanks
> >>>
> >>
> >> Have a good one,
> >> --
> >> Joman Chu
> >> Carnegie Mellon University
> >> School of Computer Science 2011
> >> AIM: ARcanUSNUMquam
> >>
> >>
> >
>
>
> --
> Joman Chu
> Carnegie Mellon University
> School of Computer Science 2011
> AIM: ARcanUSNUMquam
>
>


Re: A question about Mapper

2008-10-04 Thread Joman Chu
Hello,

I assume you want to associate {a,b}, {c,d,e}, and {f} into sets.

One way to do this is by associating some value with each flag and then 
emitting the data associated with that value. For example,

flag
a
b
flag
c
d
e
flag
f

I define flag,a,b,c,d,e,f to be the key while in the Mapper context.

Whenever the mapper sees a key, it will emit <UID, Key>. UID is some unique 
identifier associated with a certain set, and Key is the key that was passed 
into the mapper. We are essentially inverting the association here.

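Let's step through this test case.
 1. Choose UID = mapper1flag1.
 2. <flag> -> Mapper -> <mapper1flag1, flag>
 3. We have reached a flag, so we change the UID = mapper1flag2.
 4. <a> -> Mapper -> <mapper1flag2, a>
 5. <b> -> Mapper -> <mapper1flag2, b>
 6. <flag> -> Mapper -> <mapper1flag2, flag>
 7. We have reached a flag, so we change the UID = mapper1flag3.
 8. <c> -> Mapper -> <mapper1flag3, c>
 9. <d> -> Mapper -> <mapper1flag3, d>
10. <e> -> Mapper -> <mapper1flag3, e>
11. <flag> -> Mapper -> <mapper1flag3, flag>
12. We have reached a flag, so we change the UID = mapper1flag4.
13. <f> -> Mapper -> <mapper1flag4, f>
14. EOF

Then the reducers will collect all values with the same UID, so here is what we 
get:

1. <mapper1flag1, {flag}> -> Reducer -> <{}, null>
2. <mapper1flag2, {a,b,flag}> -> Reducer -> <{a,b}, null>
3. <mapper1flag3, {c,d,e,flag}> -> Reducer -> <{c,d,e}, null>
4. <mapper1flag4, {f}> -> Reducer -> <{f}, null>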
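
The walkthrough above can be sketched in plain Python. This is only a
simulation of the data flow, not Hadoop code: map_records, reduce_groups and
mapper_id are hypothetical names, and the sketch assumes one reading of the
walkthrough, namely that flag lines are themselves emitted under the current
UID and then dropped on the reduce side. It also ignores the case where one
set is divided between two mappers.

```python
from collections import defaultdict

FLAG = "flag"

def map_records(keys, mapper_id="mapper1"):
    """Yield (uid, key) for every key; switch to a new UID after each flag."""
    flag_count = 1
    for key in keys:
        yield f"{mapper_id}flag{flag_count}", key
        if key == FLAG:
            # The flag closes the current set, so later keys get a new UID.
            flag_count += 1

def reduce_groups(pairs):
    """Group keys by UID and drop the flag markers, as a reducer might."""
    groups = defaultdict(list)
    for uid, key in pairs:
        members = groups[uid]  # creates the (possibly empty) set for this UID
        if key != FLAG:
            members.append(key)
    return dict(groups)

records = ["flag", "a", "b", "flag", "c", "d", "e", "flag", "f"]
groups = reduce_groups(map_records(records))
for uid, members in groups.items():
    print(uid, members)
```

Because the set for f is closed by its UID rather than by a following flag,
nothing special has to happen at end of file in this variant.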

Hopefully this solves your problem.

On Sat, October 4, 2008 2:48 am, Zhou, Yunqing said:
> But the close() function doesn't give me a Collector to emit pairs to.
>
> Is it reasonable for me to store a reference to the collector in advance?
>
> I'm not sure whether the collector is still available at that point.
> 
> 
> 
> 
> On Sat, Oct 4, 2008 at 12:17 PM, Joman Chu <[EMAIL PROTECTED]> wrote:
> 
> 
>> Hello,
>> 
>> Does MapReduceBase.close() fit your needs? Take a look at 
>> http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/mapred/MapReduceBase.html#close()
>> 
>> On Fri, October 3, 2008 11:36 pm, Zhou, Yunqing said:
>>> The input is as follows: flag a b flag c d e flag f
>>>
>>> I used a mapper that first stores values and then emits them all
>>> when it meets a line containing "flag". But when the file reaches its
>>> end, I have no chance to emit the last record (in this case, f). So
>>> how can I detect the end of the mapper's life, or how can I emit a
>>> last record before the mapper exits?
>>> 
>>> Thanks
>>> 
>> 
>> Have a good one,
>> --
>> Joman Chu
>> Carnegie Mellon University
>> School of Computer Science 2011
>> AIM: ARcanUSNUMquam
>> 
>> 
> 


-- 
Joman Chu
Carnegie Mellon University
School of Computer Science 2011
AIM: ARcanUSNUMquam



Re: A question about Mapper

2008-10-03 Thread Zhou, Yunqing
But the close() function doesn't give me a Collector to emit pairs to.

Is it reasonable for me to store a reference to the collector in advance?

I'm not sure whether the collector is still available at that point.
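
To make the question concrete, here is a minimal plain-Python sketch of the
"store the collector, flush in close()" idea. It only models the control
flow: ListCollector and BufferingMapper are hypothetical stand-ins rather than
Hadoop classes, and the sketch does not show whether Hadoop's real collector
is still usable inside close().

```python
class ListCollector:
    """Hypothetical stand-in for an output collector: it records pairs."""

    def __init__(self):
        self.pairs = []

    def collect(self, key, value):
        self.pairs.append((key, value))


class BufferingMapper:
    """Buffers lines between flags and emits each finished set as one key."""

    def __init__(self):
        self.buffer = []
        self.output = None  # no collector is known until map() is called

    def map(self, line, output):
        self.output = output  # remember the collector for use in close()
        if line == "flag":
            self._flush()
        else:
            self.buffer.append(line)

    def close(self):
        # Called once after the last record: emit whatever is still buffered.
        self._flush()

    def _flush(self):
        if self.buffer and self.output is not None:
            self.output.collect(tuple(self.buffer), None)
            self.buffer = []


collector = ListCollector()
mapper = BufferingMapper()
for line in ["flag", "a", "b", "flag", "c", "d", "e", "flag", "f"]:
    mapper.map(line, collector)
mapper.close()
print(collector.pairs)
```

In this model the final set (f) is only emitted by the close() call; without
it the collector holds just the first two sets.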




On Sat, Oct 4, 2008 at 12:17 PM, Joman Chu <[EMAIL PROTECTED]> wrote:

> Hello,
>
> Does MapReduceBase.close() fit your needs? Take a look at
> http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/mapred/MapReduceBase.html#close()
>
> On Fri, October 3, 2008 11:36 pm, Zhou, Yunqing said:
> > The input is as follows: flag a b flag c d e flag f
> >
> > I used a mapper that first stores values and then emits them all when
> > it meets a line containing "flag". But when the file reaches its end, I
> > have no chance to emit the last record (in this case, f). So how can I
> > detect the end of the mapper's life, or how can I emit a last record
> > before the mapper exits?
> >
> > Thanks
> >
>
> Have a good one,
> --
> Joman Chu
> Carnegie Mellon University
> School of Computer Science 2011
> AIM: ARcanUSNUMquam
>
>


Re: A question about Mapper

2008-10-03 Thread Joman Chu
Hello,

Does MapReduceBase.close() fit your needs? Take a look at 
http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/mapred/MapReduceBase.html#close()

On Fri, October 3, 2008 11:36 pm, Zhou, Yunqing said:
> The input is as follows: flag a b flag c d e flag f
>
> I used a mapper that first stores values and then emits them all when it
> meets a line containing "flag". But when the file reaches its end, I have
> no chance to emit the last record (in this case, f). So how can I detect
> the end of the mapper's life, or how can I emit a last record before the
> mapper exits?
> 
> Thanks
> 

Have a good one,
-- 
Joman Chu
Carnegie Mellon University
School of Computer Science 2011
AIM: ARcanUSNUMquam