Re: A question about Mapper
Thanks a lot for such a detailed explanation. But I think the reducer is unnecessary here, so I set the number of reducers to 0. I'd like to handle everything in the mappers, which is how I ran into the problem.

Thanks anyway.

On Sat, Oct 4, 2008 at 3:33 PM, Joman Chu <[EMAIL PROTECTED]> wrote:
> Hello,
>
> I assume you want to associate {a,b}, {c,d,e}, and {f} into sets.
>
> One way to do this is by associating some value with each flag and then
> emitting the data associated with that value.
> [...]
Re: A question about Mapper
Hello,

I assume you want to associate {a,b}, {c,d,e}, and {f} into sets.

One way to do this is by associating some value with each flag and then emitting the data associated with that value. For example,

flag
a
b
flag
c
d
e
flag
f

I define flag,a,b,c,d,e,f to be the key while in the Mapper context.

Whenever the mapper sees a key, it will emit <UID, Key>. UID is some unique identifier associated with a certain set, and Key is the key that was passed into the mapper. We are essentially inverting the association here.

Let's step through this testcase.
1. Choose UID = mapper1flag1.
2. -> Mapper ->
3. We have reached a flag, so we change the UID = mapper1flag2.
4. -> Mapper ->
5. -> Mapper ->
6. -> Mapper ->
7. We have reached a flag, so we change the UID = mapper1flag3.
8. -> Mapper ->
9. -> Mapper ->
10. -> Mapper ->
11. -> Mapper ->
12. We have reached a flag, so we change the UID = mapper1flag4.
13. -> Mapper ->
14. EOF

Then the reducers will collect all values with the same UID, so here is what we get:

1. -> Reducer -> <{}, null>
2. -> Reducer -> <{a,b}, null>
3. -> Reducer -> <{c,d,e}, null>
4. -> Reducer -> <{f}, null>

Hopefully this solves your problem.

On Sat, October 4, 2008 2:48 am, Zhou, Yunqing said:
> But the close() function doesn't give me a Collector to put pairs in.
>
> Is it reasonable for me to store a reference to the collector in advance?
>
> I'm not sure whether the collector is still available by then.

-- 
Joman Chu
Carnegie Mellon University
School of Computer Science 2011
AIM: ARcanUSNUMquam
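A minimal sketch of the UID-tagging idea described in this message, written as plain Java rather than against the Hadoop API. The names UidTaggingSketch, mapPhase and reducePhase are hypothetical, and the reduce step is only simulated with an in-memory map. It assumes each record is one line and that a line equal to "flag" starts a new set. Unlike the walk-through above, it emits nothing for the flag lines themselves, so no empty set shows up for mapper1flag1.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class UidTaggingSketch {
    // Simulated map phase: every non-flag line is emitted as a (UID, key) pair.
    // Each "flag" line switches to a new UID, so later lines land in a new set.
    static List<String[]> mapPhase(String mapperName, List<String> lines) {
        List<String[]> pairs = new ArrayList<>();
        int flagCount = 1;
        String uid = mapperName + "flag" + flagCount;   // e.g. mapper1flag1
        for (String line : lines) {
            if (line.equals("flag")) {
                flagCount++;
                uid = mapperName + "flag" + flagCount;
            } else {
                pairs.add(new String[] {uid, line});
            }
        }
        return pairs;
    }

    // Simulated reduce phase: collect all values that share the same UID.
    static Map<String, List<String>> reducePhase(List<String[]> pairs) {
        Map<String, List<String>> groups = new TreeMap<>();
        for (String[] pair : pairs) {
            groups.computeIfAbsent(pair[0], k -> new ArrayList<>()).add(pair[1]);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> input =
            Arrays.asList("flag", "a", "b", "flag", "c", "d", "e", "flag", "f");
        System.out.println(reducePhase(mapPhase("mapper1", input)));
    }
}
```

Prefixing the UID with a per-mapper name (mapper1 here, as in the walk-through) is what keeps sets from different mappers from colliding once they reach the reducers.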
Re: A question about Mapper
But the close() function doesn't give me a Collector to put pairs in.

Is it reasonable for me to store a reference to the collector in advance?

I'm not sure whether the collector is still available by then.

On Sat, Oct 4, 2008 at 12:17 PM, Joman Chu <[EMAIL PROTECTED]> wrote:
> Hello,
>
> Does MapReduceBase.close() fit your needs? Take a look at
> http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/mapred/MapReduceBase.html#close()
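The pattern asked about here (keep a reference to the collector, then emit the last buffered record from close()) can be sketched in plain Java as follows. This is not Hadoop code: Collector, FlagGroupingMapper and FlagGroupingDemo are hypothetical stand-ins, and the sketch simply assumes the collector stays usable until close() returns, which this thread does not confirm for the real framework.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an output collector; not the Hadoop interface.
interface Collector {
    void collect(String key, String value);
}

// Buffers lines between "flag" markers and emits each finished group.
class FlagGroupingMapper {
    private final List<String> buffer = new ArrayList<>();
    private Collector saved;   // remembered so close() can still emit
    private int groupId = 0;

    // Called once per input line, in the role of map().
    void map(String line, Collector out) {
        saved = out;
        if (line.equals("flag")) {
            flush();           // a new flag ends the previous group
        } else {
            buffer.add(line);
        }
    }

    // Called once after the last line, in the role of close().
    void close() {
        flush();               // emits the final group, e.g. {f}
    }

    private void flush() {
        if (buffer.isEmpty() || saved == null) {
            return;            // nothing buffered, or map() was never called
        }
        saved.collect("group" + groupId++, String.join(",", buffer));
        buffer.clear();
    }
}

class FlagGroupingDemo {
    // Drives the mapper over the given lines and returns what it emitted.
    static List<String> run(String... lines) {
        List<String> emitted = new ArrayList<>();
        Collector collector = (key, value) -> emitted.add(key + "=" + value);
        FlagGroupingMapper mapper = new FlagGroupingMapper();
        for (String line : lines) {
            mapper.map(line, collector);
        }
        mapper.close();
        return emitted;
    }

    public static void main(String[] args) {
        System.out.println(run("flag", "a", "b", "flag", "c", "d", "e", "flag", "f"));
    }
}
```

Without the flush() in close(), the last group would stay in the buffer and never be emitted, which is exactly the lost-last-record problem from the original question.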
Re: A question about Mapper
Hello,

Does MapReduceBase.close() fit your needs? Take a look at
http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/mapred/MapReduceBase.html#close()

On Fri, October 3, 2008 11:36 pm, Zhou, Yunqing said:
> The input is as follows: flag a b flag c d e flag f
>
> I used a mapper to first store values and then emit them all when it
> meets a line that contains "flag". But when the file reaches its end, I
> have no chance to emit the last record (in this case, f). So how can I
> detect the end of the mapper's life, or how can I emit a last record
> before a mapper exits?
>
> Thanks

Have a good one,
-- 
Joman Chu
Carnegie Mellon University
School of Computer Science 2011
AIM: ARcanUSNUMquam