I ended up using my own MapRunner so that I can control the calls to the map function myself; with that in place, calling close() is no longer necessary. I do think it is reasonable for close() to throw IOException, but passing it an OutputCollector may make the framework a little messy. My suggestion is to keep close() as it is and provide a customized MapRunner that meets our needs.
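
To make the idea concrete, here is a rough sketch of the pattern, not the code I actually run. It does not use the real Hadoop classes: Collector, FlushingMapper, FlushingRunner, CountingMapper and FlushingRunnerDemo are simplified, hypothetical stand-ins I made up for illustration. The point is only that the runner owns the record loop, so it knows when the last record has been consumed and can hand the collector to a final hook that is allowed to throw IOException.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Simplified stand-in for Hadoop's OutputCollector (NOT the real interface).
interface Collector<K, V> {
    void collect(K key, V value) throws IOException;
}

// Hypothetical mapper contract: map() is called per record, and finish()
// receives the collector once after the last record.
interface FlushingMapper<I, K, V> {
    void map(I record, Collector<K, V> out) throws IOException;
    void finish(Collector<K, V> out) throws IOException;
}

// Stand-in for a customized MapRunner: it drives the loop over the records,
// so it can call the final hook right after the last one.
final class FlushingRunner<I, K, V> {
    private final FlushingMapper<I, K, V> mapper;

    FlushingRunner(FlushingMapper<I, K, V> mapper) {
        this.mapper = mapper;
    }

    void run(Iterator<I> records, Collector<K, V> out) throws IOException {
        while (records.hasNext()) {
            mapper.map(records.next(), out);
        }
        // All records assigned to this task have been seen: final flush.
        mapper.finish(out);
    }
}

// Example mapper: accumulates counts in memory and emits them only at the end.
final class CountingMapper implements FlushingMapper<String, String, Integer> {
    private final Map<String, Integer> counts = new TreeMap<>();

    public void map(String record, Collector<String, Integer> out) {
        counts.merge(record, 1, Integer::sum);
    }

    public void finish(Collector<String, Integer> out) throws IOException {
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            out.collect(e.getKey(), e.getValue());
        }
    }
}

// Small driver that feeds a list of records through the runner and returns
// the emitted pairs as "key=value" strings.
final class FlushingRunnerDemo {
    static List<String> countRecords(List<String> records) throws IOException {
        List<String> emitted = new ArrayList<>();
        new FlushingRunner<>(new CountingMapper())
            .run(records.iterator(), (k, v) -> emitted.add(k + "=" + v));
        return emitted;
    }
}
```

Nothing is emitted while the records are being read; all output is written in finish(). In a real job the same loop would live in a class implementing Hadoop's MapRunnable, reading from the RecordReader instead of an Iterator.
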

On Thu, Aug 21, 2008 at 3:51 AM, Chris Dyer <[EMAIL PROTECTED]> wrote:

> Qin's question actually raises an issue-- it seems that using a
> close() call, which does not throw IOException and which does not
> provide the user with access to the OutputCollector object makes this
> important piece of functionality (from a client's perspective) hard to
> use. Does anyone feel strongly about altering the contract so that
> close() throws IOException and provides the implementer with the
> OutputCollector object?
>
> On Wed, Aug 20, 2008 at 1:43 PM, Qin Gao <[EMAIL PROTECTED]> wrote:
> > Thanks Chris, that's exactly what I am trying to do. It solves my problem.
> >
> > On Wed, Aug 20, 2008 at 4:36 PM, Chris Dyer <[EMAIL PROTECTED]> wrote:
> >
> >> Qin, since I can guess what you're trying to do with this (emit a
> >> bunch of expected counts at the end of EM?), you can write output
> >> during the call to close(). It involves having to store the output
> >> collector object as a member of the class, but this is a way to do a
> >> final flush on the object before it is destroyed.
> >>
> >> Chris
> >>
> >> On Wed, Aug 20, 2008 at 7:02 PM, Qin Gao <[EMAIL PROTECTED]> wrote:
> >> > Hi mailing,
> >> >
> >> > Are there any way to know whether the mapper is processing the last record
> >> > that assigned to this node, or know how many records remain to be processed
> >> > in this node?
> >> >
> >> >
> >> > Qin