Hi Renato,

BTW, thank you for pointing me to this resource. It makes sense to me.

Lewis

On Tue, Feb 26, 2013 at 3:56 PM, <user-digest-h...@gora.apache.org> wrote:
> Hi Lewis,
>
> I think this has to do with [1], which comes down to deciding whether or
> not to create a new object every time we emit data from the mapper to
> the reducer, or from the reducer to the output. For example, creating
> 10 million objects is far more expensive than setting different values
> 10 million times on a single object. I bet [1] is a better explanation
> of what I am trying to say here.
>
> So the GeneratorJob generates the URLs to be fetched, and the FetcherJob
> actually gets all this data, is this right? If so, then the GeneratorJob
> decision makes sense, and maybe in the Fetcher we need to keep
> references to the objects, which is why we don't want to use a single one.
>
> Anyway, I am just guessing on this last part; I am not sure that is
> actually how it happens. I will look at the code tomorrow just to be
> sure. Hope it helps.
>

--
*Lewis*
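
For anyone reading this thread later, the trade-off discussed in the quoted message can be sketched in plain Java. This is only an illustration under assumptions, not the Nutch or Gora code: `ReuseSketch`, `Record` and the three methods are hypothetical names, and `Record` merely stands in for whatever mutable object a job would emit. It shows why reusing one instance is cheap when each value is consumed right away, and why it goes wrong as soon as references to the object are kept.

```java
import java.util.ArrayList;
import java.util.List;

public class ReuseSketch {

    // Hypothetical mutable value type, standing in for an emitted record.
    public static final class Record {
        public long value;

        public void set(long v) {
            value = v;
        }
    }

    // Reuse: one allocation in total. Safe here because each value is
    // consumed immediately inside the loop and no reference is kept.
    public static long sumWithReuse(int n) {
        Record shared = new Record();
        long sum = 0;
        for (int i = 0; i < n; i++) {
            shared.set(i);
            sum += shared.value;
        }
        return sum;
    }

    // Pitfall: keeping references to a reused object. Every list element
    // is the same instance, so all of them show the last value written.
    public static List<Record> keepReused(int n) {
        Record shared = new Record();
        List<Record> kept = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            shared.set(i);
            kept.add(shared);
        }
        return kept;
    }

    // Fresh object per item: n allocations, but each kept reference
    // retains its own value.
    public static List<Record> keepFresh(int n) {
        List<Record> kept = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            Record r = new Record();
            r.set(i);
            kept.add(r);
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(sumWithReuse(4));            // prints 6
        System.out.println(keepReused(3).get(0).value); // prints 2
        System.out.println(keepFresh(3).get(0).value);  // prints 0
    }
}
```

If the guess about the Fetcher holding on to its objects is right, it would correspond to the `keepFresh` case, while a job that only emits and moves on would correspond to `sumWithReuse`.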