Harsh,
Appreciate you responding to the thread. Sorry, I got a bit held up with work,
hence couldn't reply.
I am not sure what you mean by setup/cleanup procedures.
In my response I actually meant the latter, i.e. the API hooks 'setup/cleanup'. If
the APIs are called per split/partition, why do we need to
Hello Anirudh,
On 02-Jan-2012, at 5:31 AM, Anirudh wrote:
> Any specific reason why setup is called for every task attempt? From an
> optimization point of view, wouldn't it be good if the setup is called only
> once in case of JVM reuse?
Note that the task setup/cleanup procedures are separate fro
Thank you very much for the help.
I am going to start working on it soon (a few days) and will probably have
more questions :)
Eyal Golan
egola...@gmail.com
Visit: http://jvdrums.sourceforge.net/
LinkedIn: http://www.linkedin.com/in/egolan74
Skype: egolan74
Any specific reason why setup is called for every task attempt? From an
optimization point of view, wouldn't it be good if the setup is called only
once in case of JVM reuse?
I have not yet looked at the implementation. In case of JVM reuse, is the
application Mapper instance reused or a new instance is
You are guaranteed one setup call for every single task attempt. This
is regardless of JVM reuse being on or off. JVM reuse will cause no
issues with what Eyal is attempting to do.
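To illustrate the guarantee above, here is a minimal sketch that mimics the setup/map/cleanup shape of a mapper in plain Java, with no Hadoop dependency. The names ExpensiveHelper, SketchMapper and LifecycleSketch are hypothetical and exist only for this illustration. The driver loop stands in for what the framework does for one task attempt; it is not Hadoop's actual implementation.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for an object that is costly to build; not a Hadoop class.
class ExpensiveHelper {
    static int instancesCreated = 0;

    ExpensiveHelper() {
        instancesCreated++;
    }

    String process(String record) {
        return record.toUpperCase();
    }
}

// Mimics the setup/map/cleanup hooks of a mapper (assumed shape, no Hadoop types).
class SketchMapper {
    private ExpensiveHelper helper;

    // Runs once per task attempt, so the helper is built once per attempt.
    void setup() {
        helper = new ExpensiveHelper();
    }

    // Runs once per record in the split and reuses the helper built in setup().
    String map(String record) {
        return helper.process(record);
    }

    void cleanup() {
        helper = null;
    }
}

class LifecycleSketch {
    // Simulates one task attempt over a split; returns how many helpers were built.
    static int runTaskAttempt(List<String> split) {
        int before = ExpensiveHelper.instancesCreated;
        SketchMapper mapper = new SketchMapper();
        mapper.setup();
        for (String record : split) {
            mapper.map(record);
        }
        mapper.cleanup();
        return ExpensiveHelper.instancesCreated - before;
    }

    public static void main(String[] args) {
        int created = runTaskAttempt(Arrays.asList("a", "b", "c"));
        System.out.println("helpers created for 3 records: " + created);
    }
}
```

Running main prints "helpers created for 3 records: 1". Moving the `new ExpensiveHelper()` call into map() would instead build one helper per record.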
On Sun, Jan 1, 2012 at 5:49 PM, Anirudh wrote:
> No problems Eyal.
>
> On a second thought, for the JVM re-use the
No problems Eyal.
On a second thought, for the JVM re-use the Mapper/Reducer instances
should be re-used, and the setup should be called only once. This makes
sense too as the JVM reuse is for the same job.
You should be good with class instantiation even if the JVM reuse is
enabled.
On Sat, Dec
Thank you very much for the detailed explanation Anirudh.
I think that my question about node / VM was due to some lack of knowledge
(I'm just starting to learn the Hadoop environment).
Regarding configuration of the nodes and clusters:
This is something that I am not doing by myself. We have a de
I just wanted to confirm where exactly you were planning to have the
instantiation code, as it was not mentioned in your previous post. The
location would have made a difference. As you are doing it in the setup of
the mapper/reducer, you are good.
I was referring to the Task JVM Reuse option:
http://ha
My idea is to create that class in the setup / configure method (depending
on which Mapper / Reducer I will inherit from).
I don't understand the 'reuse' option you are referring to.
How many map tasks will be created? One per split or one per VM (node)?
Are you suggesting that although there would be
Where are you creating this new class? If it is in the map function, then
it will create a new object for each record in the split.
Also you may need to see how the JVM reuse option works. I am not too sure
of this and you may want to look at the code. If the option for JVM reuse
is set, then m
Great news!
Thanks for the info.
So using reflection, I can inject different implementations of interfaces
(services) for the mapper (or reducer).
This way I can test a mapper (or reducer) just by reflecting a stub
instead of a real implementation.
Thanks,
Eyal Golan
egola...@gmail.com
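A minimal sketch of the reflection idea described above, in plain Java with no Hadoop dependency. The interface LookupService, both implementations and ReflectiveMapperSketch are hypothetical names; the thread does not say which services are involved. In a real job the class name would presumably be read from the job configuration; here it is passed as a plain argument to keep the sketch self-contained.

```java
// Hypothetical service interface; the thread does not name the real one.
interface LookupService {
    String lookup(String key);
}

// Hypothetical production implementation.
class RealLookupService implements LookupService {
    public String lookup(String key) {
        return "real:" + key;
    }
}

// Hypothetical stub used in tests in place of the real implementation.
class StubLookupService implements LookupService {
    public String lookup(String key) {
        return "stub:" + key;
    }
}

class ReflectiveMapperSketch {
    private LookupService service;

    // Builds the service by class name via reflection, once per setup call.
    // Assumption: the class has a no-argument constructor.
    void setup(String serviceClassName) {
        try {
            Class<?> cls = Class.forName(serviceClassName);
            service = (LookupService) cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot create " + serviceClassName, e);
        }
    }

    // Delegates each record to whichever implementation setup() created.
    String map(String record) {
        return service.lookup(record);
    }

    public static void main(String[] args) {
        ReflectiveMapperSketch mapper = new ReflectiveMapperSketch();
        mapper.setup(StubLookupService.class.getName());
        System.out.println(mapper.map("k1"));
    }
}
```

Running main prints "stub:k1". Passing the name of RealLookupService instead switches the implementation without touching the mapper code, which is what makes the stub usable in a test.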
Eyal,
Yes, it is right to think of each Task attempt being one individual JVM running
individually on any added Node. Multiple slots would mean multiple VMs in
parallel as well. Yes, your use of reflection to build your objects will work
just fine -- it's all user-side Java code that is executed
Hi,
I want to understand a basic concept in MR.
If a mapper creates an instance of some class (using the 'new' operator),
then the created instance exists ONCE in the VM of this node,
for each node.
Correct?
Now,
what if instead of using the 'new' operator, the class is created using
reflection.
Is