Re: What's the basic idea of pseudo-distributed Hadoop ?

2012-09-14 Thread Kai Voigt
Hello. On 14.09.2012 at 08:03, Jason Yang lin.yang.ja...@gmail.com wrote: I have a question about how the pseudo-distributed Hadoop cluster works: when many map tasks are submitted to the pseudo-distributed Hadoop cluster, does Hadoop run each mapper in sequence, or does it run

Re: What's the basic idea of pseudo-distributed Hadoop ?

2012-09-14 Thread Jason Yang
Hey Kai, thanks for your reply. I was wondering what the difference is between pseudo-distributed and fully distributed Hadoop, apart from the maximum number of map/reduce tasks. And if an MR program works fine in a pseudo-distributed cluster, will it work just as well in a fully distributed cluster?

Re: What's the basic idea of pseudo-distributed Hadoop ?

2012-09-14 Thread Harsh J
Hi Jason, I think you're confusing standalone mode with pseudo-distributed mode. The former is a limited mode of MR where no daemons need to be deployed and the tasks run in a single JVM (via threads). A pseudo-distributed cluster is a cluster where all daemons are running on one node
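
To make the distinction above concrete, here is a minimal sketch in Python. It assumes the Hadoop 1.x property names `fs.default.name` and `mapred.job.tracker` and the localhost ports commonly used in single-node setups; the `describe_mode` helper is hypothetical and only a rough heuristic, not part of Hadoop.

```python
# Assumed Hadoop 1.x-style properties; the values are typical, not mandatory.
STANDALONE = {
    "fs.default.name": "file:///",    # local filesystem, no HDFS daemons
    "mapred.job.tracker": "local",    # tasks run inside a single JVM
}

PSEUDO_DISTRIBUTED = {
    "fs.default.name": "hdfs://localhost:9000",  # NameNode on this machine
    "mapred.job.tracker": "localhost:9001",      # JobTracker on this machine
    "dfs.replication": "1",                      # only one DataNode available
}


def describe_mode(props):
    """Hypothetical helper: guess the run mode from a property dict."""
    tracker = props.get("mapred.job.tracker", "local")
    if tracker == "local":
        return "standalone"
    host = tracker.split(":")[0]
    if host in ("localhost", "127.0.0.1"):
        return "pseudo-distributed"
    # Heuristic only: a non-local JobTracker host suggests a real cluster.
    return "fully distributed"
```

For example, `describe_mode(PSEUDO_DISTRIBUTED)` returns `"pseudo-distributed"`, while a dict with no `mapred.job.tracker` entry falls back to `"standalone"`.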

Re: What's the basic idea of pseudo-distributed Hadoop ?

2012-09-14 Thread Bertrand Dechoux
The only difference between pseudo-distributed and fully distributed would be scale. You could say that code that runs fine on the former also runs fine on the latter. But it does not necessarily mean that the performance will scale the same way (i.e., if you keep a list of elements in memory, at

Re: What's the basic idea of pseudo-distributed Hadoop ?

2012-09-14 Thread Jason Yang
All right, I got it. Thanks to all of you. 2012/9/14 Bertrand Dechoux decho...@gmail.com The only difference between pseudo-distributed and fully distributed would be scale. You could say that code that runs fine on the former, runs fine too on the latter. But it does not necessarily mean

Re: What's the basic idea of pseudo-distributed Hadoop ?

2012-09-14 Thread Hemanth Yamijala
One thing to be careful about is the paths of dependent libraries or executables, such as streaming binaries. In pseudo-distributed mode, since all processes are looking on the same machine, it is likely that they will find paths that are really local to only the machine where the job is being launched
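
As a rough illustration of the pitfall above, the following Python sketch flags dependencies that a job refers to by path but does not ship with it. The function name and the example paths are hypothetical, and it assumes that shipped files are matched by file name on the task side.

```python
import os


def unshipped_dependencies(referenced, shipped):
    """Return referenced paths whose file name is not among the shipped files.

    Such paths may resolve on the launching machine (and so in
    pseudo-distributed mode) but can be missing on other cluster nodes.
    """
    shipped_names = {os.path.basename(p) for p in shipped}
    return [p for p in referenced if os.path.basename(p) not in shipped_names]


# Hypothetical example: the mapper script is shipped, the native library is not.
referenced = ["/home/me/bin/mapper.py", "/usr/local/lib/libfoo.so"]
shipped = ["/home/me/bin/mapper.py"]
print(unshipped_dependencies(referenced, shipped))
```

Anything this check reports would need to be installed on every node or distributed with the job before moving to a fully distributed cluster.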